[1]
|
E. Alpaydin, Introduction to Machine Learning, MIT press, Cambridge, 2020.
|
[2]
|
A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with deep convolutional neural networks, Commun. ACM, 60 (2017), 84–90. https://doi.org/10.1145/3065386 doi: 10.1145/3065386
|
[3]
|
A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, et al., Dermatologist-level classification of skin cancer with deep neural networks, Nature, 542 (2017), 115–118. https://doi.org/10.1038/nature21056 doi: 10.1038/nature21056
|
[4]
|
J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (2019), 4171–4186. https://doi.org/10.18653/v1/n19-1423
|
[5]
|
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, et al., Human-level control through deep reinforcement learning, Nature, 518 (2015), 529–533. https://doi.org/10.1038/nature14236 doi: 10.1038/nature14236
|
[6]
|
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, et al., An image is worth 16x16 words: Transformers for image recognition at scale, in International Conference on Learning Representations, 2021.
|
[7]
|
C. C. Chiu, C. Raffel, Monotonic chunkwise attention, in International Conference on Learning Representations, 2018.
|
[8]
|
X. He, K. Zhao, X. Chu, AutoML: A survey of the state-of-the-art, Knowl.-Based Syst., 212 (2021), 106622. https://doi.org/10.1016/j.knosys.2020.106622 doi: 10.1016/j.knosys.2020.106622
|
[9]
|
D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, preprint, arXiv: 1412.6980.
|
[10]
|
J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization, J. Mach. Learn. Res., 13 (2012), 281–305.
|
[11]
|
K. Hussain, M. N. Mohd Salleh, S. Cheng, Y. Shi, Metaheuristic research: a comprehensive survey, Artif. Intell. Rev., 52 (2018), 2191–2233. https://doi.org/10.1007/s10462-017-9605-z doi: 10.1007/s10462-017-9605-z
|
[12]
|
I. Boussaïd, J. Lepagnot, P. Siarry, Survey on optimization metaheuristics, Inf. Sci., 237 (2013), 82–117. https://doi.org/10.1016/j.ins.2013.02.041 doi: 10.1016/j.ins.2013.02.041
|
[13]
|
J. Snoek, H. Larochelle, R. P. Adams, Practical bayesian optimization of machine learning algorithms, Adv. Neural Inform. Process. Syst., 25 (2012).
|
[14]
|
B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, N. de Freitas, Taking the human out of the loop: A review of bayesian optimization, Proc. IEEE, 104 (2016), 148–175. https://doi.org/10.1109/jproc.2015.2494218 doi: 10.1109/jproc.2015.2494218
|
[15]
|
H. Cai, C. Gan, T. Wang, Z. Zhang, S. Han, Once-for-all: Train one network and specialize it for efficient deployment, in International Conference on Learning Representations, 2020.
|
[16]
|
S. Adriaensen, A. Biedenkapp, G. Shala, N. Awad, T. Eimer, M. Lindauer, et al., Automated dynamic algorithm configuration, J. Artif. Intell. Res., 75 (2022), 1633–1699. https://doi.org/10.1613/jair.1.13922 doi: 10.1613/jair.1.13922
|
[17]
|
M. Donini, L. Franceschi, O. Majumder, M. Pontil, P. Frasconi, Marthe: scheduling the learning rate via online hypergradients, in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, (2021), 2119–2125. https://doi.org/10.24963/ijcai.2020/293
|
[18]
|
J. Parker-Holder, V. Nguyen, S. J. Roberts, Provably efficient online hyperparameter optimization with population-based bandits, Adv. Neural Inform. Process. Syst., 33 (2020), 17200–17211.
|
[19]
|
A. Biedenkapp, H. F. Bozkurt, T. Eimer, F. Hutter, M. T. Lindauer, Dynamic algorithm configuration: Foundation of a new meta-algorithmic framework, in the 24th European Conference on Artificial Intelligence, (2020), 427–434. https://doi.org/10.3233/FAIA200122
|
[20]
|
F. Karl, T. Pielok, J. Moosbauer, F. Pfisterer, S. Coors, M. Binder, et al., Multi-objective hyperparameter optimization in machine learning–An overview, ACM Transactions on Evolutionary Learning and Optimization, 3 (2023), 1–50. https://doi.org/10.1145/3610536
|
[21]
|
A. Morales-Hernández, I. Van Nieuwenhuyse, S. Rojas Gonzalez, A survey on multi-objective hyperparameter optimization algorithms for machine learning, Artif. Intell. Rev., 56 (2023), 8043–8093. https://doi.org/10.1007/s10462-022-10359-2 doi: 10.1007/s10462-022-10359-2
|
[22]
|
B. Bischl, M. Binder, M. Lang, T. Pielok, J. Richter, S. Coors, et al., Hyperparameter optimization: Foundations, algorithms, best practices and open challenges, WIRES Data Min. Knowl., 13 (2023). https://doi.org/10.1002/widm.1484 doi: 10.1002/widm.1484
|
[23]
|
T. Yu, H. Zhu, Hyper-parameter optimization: A review of algorithms and applications, preprint, arXiv: 2003.05689.
|
[24]
|
L. Yang, A. Shami, On hyperparameter optimization of machine learning algorithms: Theory and practice, Neurocomputing, 415 (2020), 295–316. https://doi.org/10.1016/j.neucom.2020.07.061 doi: 10.1016/j.neucom.2020.07.061
|
[25]
|
R. Mohakud, R. Dash, Survey on hyperparameter optimization using nature-inspired algorithm of deep convolution neural network, in Intelligent and Cloud Computing, (2020), 737–744. https://doi.org/10.1007/978-981-15-5971-6_77
|
[26]
|
N. Del Buono, F. Esposito, L. Selicato, Methods for hyperparameters optimization in learning approaches: An overview, in Machine Learning, Optimization, and Data Science, (2020), 100–112. https://doi.org/10.1007/978-3-030-64583-0_11
|
[27]
|
M. Feurer, F. Hutter, Hyperparameter Optimization, in Automated Machine Learning, (2019), 3–33. https://doi.org/10.1007/978-3-030-05318-5_1
|
[28]
|
X. Wang, Y. Jin, S. Schmitt, M. Olhofer, Recent Advances in Bayesian Optimization, ACM Comput. Surv., 55 (2023), 1–36. https://doi.org/10.1145/3582078 doi: 10.1145/3582078
|
[29]
|
P. I. Frazier, A tutorial on Bayesian optimization, preprint, arXiv: 1807.02811.
|
[30]
|
E. Brochu, V. M. Cora, N. De Freitas, A tutorial on bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning, preprint, arXiv: 1012.2599.
|
[31]
|
R. E. Shawi, M. Maher, S. Sakr, Automated machine learning: State-of-the-art and open challenges, preprint, arXiv: 1906.02287.
|
[32]
|
M. A. Zöller, M. F. Huber, Benchmark and survey of automated machine learning frameworks, J. Artif. Intell. Res., 70 (2021), 409–472. https://doi.org/10.1613/jair.1.11854 doi: 10.1613/jair.1.11854
|
[33]
|
Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, et al., Backpropagation applied to handwritten zip code recognition, Neural Comput., 1 (1989), 541–551. https://doi.org/10.1162/neco.1989.1.4.541 doi: 10.1162/neco.1989.1.4.541
|
[34]
|
Y. Bengio, Gradient-based optimization of hyperparameters, Neural Comput., 12 (2000), 1889–1900. https://doi.org/10.1162/089976600300015187 doi: 10.1162/089976600300015187
|
[35]
|
J. Domke, Generic methods for optimization-based modeling, in Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, (2012), 318–326.
|
[36]
|
D. Maclaurin, D. Duvenaud, R. Adams, Gradient-based hyperparameter optimization through reversible learning, in Proceedings of the 32nd International Conference on Machine Learning, (2015), 2113–2122.
|
[37]
|
F. Pedregosa, Hyperparameter optimization with approximate gradient, in Proceedings of The 33rd International Conference on Machine Learning, (2016), 737–746.
|
[38]
|
L. Franceschi, M. Donini, P. Frasconi, M. Pontil, Forward and reverse gradient-based hyperparameter optimization, in Proceedings of the 34th International Conference on Machine Learning, (2017), 1165–1173.
|
[39]
|
J. Lorraine, P. Vicol, D. Duvenaud, Optimizing millions of hyperparameters by implicit differentiation, in Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, (2020), 1540–1552.
|
[40]
|
C. W. Hsu, C. C. Chang, C. J. Lin, A Practical Guide to Support Vector Classification, 2003. Available from: http://www.csie.ntu.edu.tw/cjlin/papers/guide/guide.pdf
|
[41]
|
C. Audet, J. E. Dennis, Mesh adaptive direct search algorithms for constrained optimization, SIAM J. Optim., 17 (2006), 188–217. https://doi.org/10.1137/040603371 doi: 10.1137/040603371
|
[42]
|
G. E. Dahl, T. N. Sainath, G. E. Hinton, Improving deep neural networks for lvcsr using rectified linear units and dropout, in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, (2013), 8609–8613. https://doi.org/10.1109/ICASSP.2013.6639346
|
[43]
|
Y. Chen, A. Huang, Z. Wang, I. Antonoglou, J. Schrittwieser, D. Silver, et al., Bayesian optimization in alphago, preprint, arXiv: 1812.06855.
|
[44]
|
C. E. Rasmussen, C. K. I. Williams, Gaussian Processes for Machine Learning, The MIT Press, Cambridge, 2005. https://doi.org/10.7551/mitpress/3206.001.0001
|
[45]
|
D. R. Jones, M. Schonlau, W. J. Welch, Efficient global optimization of expensive black-box functions, J. Glob. Optim., 13 (1998), 455–492. https://doi.org/10.1023/A:1008306431147 doi: 10.1023/A:1008306431147
|
[46]
|
K. Swersky, D. Duvenaud, J. Snoek, F. Hutter, M. A. Osborne, Raiders of the lost architecture: Kernels for bayesian optimization in conditional parameter spaces, preprint, arXiv: 1409.4011.
|
[47]
|
E. Snelson, Z. Ghahramani, Sparse gaussian processes using pseudo-inputs, Adv. Neural Inform. Process. Syst., 18 (2006), 1259–1266.
|
[48]
|
C. Oh, E. Gavves, M. Welling, Bock: Bayesian optimization with cylindrical kernels, in Proceedings of the 35th International Conference on Machine Learning, (2018), 3868–3877.
|
[49]
|
K. Kandasamy, J. Schneider, B. Póczos, High dimensional bayesian optimisation and bandits via additive models, in Proceedings of the 32nd International Conference on Machine Learning, (2015), 295–304.
|
[50]
|
F. Hutter, H. H. Hoos, K. Leyton-Brown, Sequential model-based optimization for general algorithm configuration, in Learning and Intelligent Optimization, Springer, (2011), 507–523. https://doi.org/10.1007/978-3-642-25566-3_40
|
[51]
|
J. Bergstra, R. Bardenet, Y. Bengio, B. Kégl, Algorithms for hyper-parameter optimization, Adv. Neural Inform. Process. Syst., 24 (2011), 2546–2554.
|
[52]
|
K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, et al., Towards an empirical foundation for assessing bayesian optimization of hyperparameters, in NIPS workshop on Bayesian Optimization in Theory and Practice, (2013).
|
[53]
|
S. Falkner, A. Klein, F. Hutter, BOHB: Robust and efficient hyperparameter optimization at scale, in Proceedings of the 35th International Conference on Machine Learning, (2018), 1437–1446.
|
[54]
|
E. Goan, C. Fookes, Bayesian neural networks: An introduction and survey, in Case Studies in Applied Bayesian Data Science, Springer, (2020), 45–87. https://doi.org/10.1007/978-3-030-42553-1_3
|
[55]
|
J. Snoek, O. Rippel, K. Swersky, R. Kiros, N. Satish, N. Sundaram, et al., Scalable bayesian optimization using deep neural networks, in Proceedings of the 32nd International Conference on Machine Learning, 37 (2015), 2171–2180.
|
[56]
|
J. T. Springenberg, A. Klein, S. Falkner, F. Hutter, Bayesian optimization with robust bayesian neural networks, Adv. Neural Inf. Process. Syst., 29 (2016), 4134–4142.
|
[57]
|
T. Chen, E. B. Fox, C. Guestrin, Stochastic gradient hamiltonian monte carlo, in Proceedings of the 31st International Conference on Machine Learning, 32 (2014), 1683–1691.
|
[58]
|
N. Srinivas, A. Krause, S. M. Kakade, M. W. Seeger, Gaussian process optimization in the bandit setting: No regret and experimental design, in Proceedings of the 27th International Conference on Machine Learning, Omnipress, (2010), 1015–1022.
|
[59]
|
P. Hennig, C. J. Schuler, Entropy search for information-efficient global optimization, J. Mach. Learn. Res., 13 (2012), 1809–1837.
|
[60]
|
J. M. Hernández-Lobato, M. W. Hoffman, Z. Ghahramani, Predictive entropy search for efficient global optimization of black-box functions, Adv. Neural Inform. Process. Syst., 27 (2014), 918–926.
|
[61]
|
Z. Wang, S. Jegelka, Max-value entropy search for efficient bayesian optimization, in Proceedings of the 34th International Conference on Machine Learning, (2017), 3627–3635.
|
[62]
|
M. Jaderberg, V. Dalibard, S. Osindero, W. M. Czarnecki, J. Donahue, A. Razavi, et al., Population based training of neural networks, preprint, arXiv: 1711.09846.
|
[63]
|
N. A. Vien, H. Zimmermann, M. Toussaint, Bayesian functional optimization, in Proceedings of the AAAI Conference on Artificial Intelligence, (2018). https://doi.org/10.1609/aaai.v32i1.11830
|
[64]
|
J. Wu, P. I. Frazier, Practical two-step lookahead bayesian optimization, Adv. Neural Inform. Process. Syst., 32 (2019), 9813–9823.
|
[65]
|
J. Kirschner, M. Mutný, N. Hiller, R. Ischebeck, A. Krause, Adaptive and safe bayesian optimization in high dimensions via one-dimensional subspaces, in Proceedings of the 36th International Conference on Machine Learning, (2019), 3429–3438.
|
[66]
|
D. Eriksson, M. Pearce, J. Gardner, R. D. Turner, M. Poloczek, Scalable global optimization via local Bayesian optimization, Adv. Neural Inf. Process. Syst., 32 (2019), 5497–5508.
|
[67]
|
V. Nguyen, M. A. Osborne, Knowing the what but not the where in bayesian optimization, in Proceedings of the 37th International Conference on Machine Learning, (2020), 7317–7326.
|
[68]
|
E. A. Daxberger, A. Makarova, M. Turchetta, A. Krause, Mixed-variable bayesian optimization, in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), (2020), 2633–2639. https://doi.org/10.24963/ijcai.2020/365
|
[69]
|
A. Souza, L. Nardi, L. B. Oliveira, K. Olukotun, M. Lindauer, F. Hutter, Bayesian optimization with a prior for the optimum, in Machine Learning and Knowledge Discovery in Databases. Research Track, Springer, (2021), 265–296. https://doi.org/10.1007/978-3-030-86523-8_17
|
[70]
|
C. Hvarfner, D. Stoll, A. Souza, L. Nardi, M. Lindauer, F. Hutter, πBO: Augmenting acquisition functions with user beliefs for bayesian optimization, in International Conference on Learning Representations, 2022.
|
[71]
|
N. Mallik, E. Bergman, C. Hvarfner, D. Stoll, M. Janowski, M. Lindauer, et al., Priorband: Practical hyperparameter optimization in the age of deep learning, Adv. Neural Inform. Process. Syst., (2024).
|
[72]
|
S. Katoch, S. S. Chauhan, V. Kumar, A review on genetic algorithm: past, present, and future, Multimed. Tools Appl., 80 (2021), 8091–8126. https://doi.org/10.1007/s11042-020-10139-6 doi: 10.1007/s11042-020-10139-6
|
[73]
|
C. A. C. Coello, G. B. Lamont, D. A. Van Veldhuizen, Evolutionary algorithms for solving multi-objective problems, Springer, New York, 2007. https://doi.org/10.1007/978-0-387-36797-2
|
[74]
|
J. H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, The MIT Press, Cambridge, 1992. https://doi.org/10.7551/mitpress/1090.001.0001
|
[75]
|
T. Blickle, L. Thiele, A comparison of selection schemes used in evolutionary algorithms, Evol. Comput., 4 (1996), 361–394. https://doi.org/10.1162/evco.1996.4.4.361 doi: 10.1162/evco.1996.4.4.361
|
[76]
|
T. Bäck, Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms, Oxford University Press, Oxford, 1996. https://doi.org/10.1093/oso/9780195099713.001.0001
|
[77]
|
I. Rechenberg, Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, PhD thesis, Technische Universität, Fakultät für Maschinenwissenschaft, 1970.
|
[78]
|
H. P. Schwefel, G. Rudolph, Contemporary evolution strategies, in Advances in Artificial Life, Springer, (1995), 891–907. https://doi.org/10.1007/3-540-59496-5_351
|
[79]
|
R. Li, M. T. Emmerich, J. Eggermont, T. Bäck, M. Schütz, J. Dijkstra, et al., Mixed integer evolution strategies for parameter optimization, Evol. Comput., 21 (2013), 29–64. https://doi.org/10.1162/evco_a_00059 doi: 10.1162/evco_a_00059
|
[80]
|
N. Hansen, A. Ostermeier, A. Gawelczyk, On the adaptation of arbitrary normal mutation distributions in evolution strategies: The generating set adaptation, in Proceedings of the Sixth International Conference on Genetic Algorithms, (1995), 57–64.
|
[81]
|
R. Storn, K. V. Price, Differential evolution-a simple and efficient heuristic for global optimization over continuous spaces, J. Glob. Optim., 11 (1997), 341–359. https://doi.org/10.1023/A:1008202821328 doi: 10.1023/A:1008202821328
|
[82]
|
S. Saremi, S. M. Mirjalili, A. Lewis, Grasshopper optimisation algorithm: Theory and application, Adv. Eng. Softw., 105 (2017), 30–47. https://doi.org/10.1016/j.advengsoft.2017.01.004 doi: 10.1016/j.advengsoft.2017.01.004
|
[83]
|
E. H. Houssein, A. G. Gad, K. Hussain, P. N. Suganthan, Major advances in particle swarm optimization: Theory, analysis, and application, Swarm Evol. Comput., 63 (2021), 100868. https://doi.org/10.1016/j.swevo.2021.100868 doi: 10.1016/j.swevo.2021.100868
|
[84]
|
J. Kennedy, R. Eberhart, Particle swarm optimization, in Proceedings of ICNN'95 - International Conference on Neural Networks, (1995), 1942–1948. https://doi.org/10.1109/ICNN.1995.488968
|
[85]
|
Y. Shi, R. Eberhart, A modified particle swarm optimizer, in 1998 IEEE International Conference on Evolutionary Computation Proceedings, (1998), 69–73. https://doi.org/10.1109/ICEC.1998.699146
|
[86]
|
R. Turner, D. Eriksson, M. McCourt, J. Kiili, E. Laaksonen, Z. Xu, et al., Bayesian optimization is superior to random search for machine learning hyperparameter tuning: analysis of the black-box optimization challenge 2020, in Proceedings of the NeurIPS 2020 Competition and Demonstration Track, (2021), 3–26.
|
[87]
|
H. G. Beyer, H. P. Schwefel, Evolution strategies–a comprehensive introduction, Nat. Comput., 1 (2002), 3–52. https://doi.org/10.1023/A:1015059928466 doi: 10.1023/A:1015059928466
|
[88]
|
K. Kandasamy, G. Dasarathy, J. B. Oliva, J. G. Schneider, B. Póczos, Gaussian process bandit optimisation with multi-fidelity evaluations, Adv. Neural Inform. Process. Syst., 29 (2016).
|
[89]
|
A. Klein, S. Falkner, S. Bartels, P. Hennig, F. Hutter, Fast bayesian optimization of machine learning hyperparameters on large datasets, in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, (2017), 528–536.
|
[90]
|
M. Poloczek, J. Wang, P. Frazier, Multi-information source optimization, Adv. Neural Inform. Process. Syst., 30 (2017), 4288–4298.
|
[91]
|
J. Wu, S. Toscano-Palmerin, P. I. Frazier, A. G. Wilson, Practical multi-fidelity bayesian optimization for hyperparameter tuning, in Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, (2020), 788–798.
|
[92]
|
S. Takeno, H. Fukuoka, Y. Tsukada, T. Koyama, M. Shiga, I. Takeuchi, et al., Multi-fidelity bayesian optimization with max-value entropy search and its parallelization, in Proceedings of the 37th International Conference on Machine Learning, (2020), 9334–9345.
|
[93]
|
K. Swersky, J. Snoek, R. P. Adams, Multi-task bayesian optimization, Adv. Neural Inform. Process. Syst., 26 (2013).
|
[94]
|
M. Feurer, J. T. Springenberg, F. Hutter, Initializing bayesian hyperparameter optimization via meta-learning, AAAI Conf. Artif. Intell., 29 (2015), 1128–1135. https://doi.org/10.1609/aaai.v29i1.9354 doi: 10.1609/aaai.v29i1.9354
|
[95]
|
V. Perrone, R. Jenatton, M. W. Seeger, C. Archambeau, Scalable hyperparameter transfer learning, Adv. Neural Inform. Process. Syst., 31 (2018).
|
[96]
|
M. Nomura, S. Watanabe, Y. Akimoto, Y. Ozaki, M. Onishi, Warm starting CMA-ES for hyperparameter optimization, AAAI Conf. Artif. Intell., 35 (2021), 9188–9196. https://doi.org/10.1609/aaai.v35i10.17109 doi: 10.1609/aaai.v35i10.17109
|
[97]
|
K. G. Jamieson, A. S. Talwalkar, Non-stochastic best arm identification and hyperparameter optimization, in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, (2016), 240–248.
|
[98]
|
L. Li, K. G. Jamieson, G. DeSalvo, A. Rostamizadeh, A. S. Talwalkar, Hyperband: A novel bandit-based approach to hyperparameter optimization, J. Mach. Learn. Res., 18 (2018), 1–52.
|
[99]
|
L. Li, K. Jamieson, A. Rostamizadeh, E. Gonina, J. Ben-Tzur, M. Hardt, et al., A system for massively parallel hyperparameter tuning, Proc. Mach. Learn. Syst., (2020), 230–246.
|
[100]
|
G. Mittal, C. Liu, N. Karianakis, V. Fragoso, M. Chen, Y. R. Fu, Hyperstar: Task-aware hyperparameters for deep networks, in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2020), 8733–8742. https://doi.org/10.1109/cvpr42600.2020.00876
|
[101]
|
N. H. Awad, N. Mallik, F. Hutter, DEHB: Evolutionary hyperband for scalable, robust and efficient hyperparameter optimization, in Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, (2021), 2147–2153. https://doi.org/10.24963/ijcai.2021/296
|
[102]
|
K. Swersky, J. Snoek, R. P. Adams, Freeze-thaw bayesian optimization, preprint, arXiv: 1406.3896.
|
[103]
|
T. Domhan, J. T. Springenberg, F. Hutter, Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves, in Twenty-fourth International Joint Conference on Artificial Intelligence, (2015), 3460–3468.
|
[104]
|
A. Klein, S. Falkner, J. T. Springenberg, F. Hutter, Learning curve prediction with bayesian neural networks, in International Conference on Learning Representations, 2017.
|
[105]
|
B. Baker, O. Gupta, R. Raskar, N. Naik, Accelerating neural architecture search using performance prediction, preprint, arXiv: 1705.10823.
|
[106]
|
Z. Dai, H. Yu, K. H. Low, P. Jaillet, Bayesian optimization meets bayesian optimal stopping, in Proceedings of the 36th International Conference on Machine Learning, (2019), 1496–1506.
|
[107]
|
V. Nguyen, S. Schulze, M. Osborne, Bayesian optimization for iterative learning, Adv. Neural Inform. Process. Syst., 33 (2020), 9361–9371.
|
[108]
|
A. Makarova, H. Shen, V. Perrone, A. Klein, J. B. Faddoul, A. Krause, et al., Automatic termination for hyperparameter optimization, in Proceedings of the First International Conference on Automated Machine Learning, 2022.
|
[109]
|
A. G. Baydin, R. Cornish, D. M. Rubio, M. Schmidt, F. Wood, Online learning rate adaptation with hypergradient descent, in International Conference on Learning Representations, 2018.
|
[110]
|
Y. Wu, M. Ren, R. Liao, R. Grosse, Understanding short-horizon bias in stochastic meta-optimization, in International Conference on Learning Representations, 2018.
|
[111]
|
J. Li, B. Gu, H. Huang, A fully single loop algorithm for bilevel optimization without hessian inverse, in Proceedings of the 36th AAAI Conference on Artificial Intelligence, (2022), 7426–7434. https://doi.org/10.1609/aaai.v36i7.20706
|
[112]
|
Z. Tao, Y. Li, B. Ding, C. Zhang, J. Zhou, Y. R. Fu, Learning to mutate with hypergradient guided population, Adv. Neural Inform. Process. Syst., 33 (2020), 17641–17651.
|
[113]
|
J. Parker-Holder, V. Nguyen, S. Desai, S. J. Roberts, Tuning mixed input hyperparameters on the fly for efficient population based autorl, Adv. Neural Inform. Process. Syst., 34 (2021).
|
[114]
|
X. Wan, C. Lu, J. Parker-Holder, P. J. Ball, V. Nguyen, B. Ru, et al., Bayesian generational population-based training, in Proceedings of the First International Conference on Automated Machine Learning, (2022), 1–27.
|
[115]
|
R. S. Sutton, A. G. Barto, Reinforcement learning: An introduction, 2nd edition, MIT press, Cambridge, 2018.
|
[116]
|
H. S. Jomaa, J. Grabocka, L. Schmidt-Thieme, Hyp-rl: Hyperparameter optimization by reinforcement learning, preprint, arXiv: 1906.11527.
|
[117]
|
S. Paul, V. Kurin, S. Whiteson, Fast efficient hyperparameter tuning for policy gradient methods, Adv. Neural Inform. Process. Syst., 32 (2019), 4616–4626.
|
[118]
|
B. Doerr, C. Doerr, Theory of parameter control for discrete black-box optimization: provable performance gains through dynamic parameter choices, in Theory of Evolutionary Computation: Recent Developments in Discrete Optimization, Springer, Cham, (2020), 271–321. https://doi.org/10.1007/978-3-030-29414-4_6
|
[119]
|
W. B. Powell, Reinforcement Learning and Stochastic Optimization: A unified framework for sequential decisions, John Wiley & Sons, Hoboken, 2022. https://doi.org/10.1002/9781119815068
|
[120]
|
J. Parker-Holder, R. Rajan, X. Song, A. Biedenkapp, Y. Miao, T. Eimer, et al., Automated reinforcement learning (AutoRL): a survey and open problems, J. Artif. Intell. Res., 74 (2022), 517–568. http://doi.org/10.1613/jair.1.13596 doi: 10.1613/jair.1.13596
|
[121]
|
R. R. Afshar, Y. Zhang, J. Vanschoren, U. Kaymak, Automated reinforcement learning: An overview, preprint, arXiv: 2201.05000.
|
[122]
|
L. Engstrom, A. Ilyas, S. Santurkar, D. Tsipras, F. Janoos, L. Rudolph, et al., Implementation matters in deep RL: A case study on PPO and TRPO, in International Conference on Learning Representations, 2020.
|
[123]
|
M. Andrychowicz, A. Raichuk, P. Stańczyk, M. Orsini, S. Girgin, R. Marinier, et al., What matters for on-policy deep actor-critic methods? a large-scale study, in International Conference on Learning Representations, 2021.
|
[124]
|
B. Zhang, R. Rajan, L. Pineda, N. Lambert, A. Biedenkapp, K. Chua, et al., On the importance of hyperparameter optimization for model-based reinforcement learning, in Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, (2021), 4015–4023.
|
[125]
|
M. Igl, G. Farquhar, J. Luketina, W. Boehmer, S. Whiteson, Transient non-stationarity and generalisation in deep reinforcement learning, in International Conference on Learning Representations, 2021.
|
[126]
|
Y. Jin, T. Zhou, L. Zhao, Y. Zhu, C. Guo, M. Canini, et al., AutoLRS: Automatic learning-rate schedule by bayesian optimization on the fly, in International Conference on Learning Representations, 2020.
|
[127]
|
J. Sun, Y. Yang, G. Xun, A. Zhang, A stagewise hyperparameter scheduler to improve generalization, in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, (2021), 1530–1540. https://doi.org/10.1145/3447548.3467287
|
[128]
|
Y. Jin, Multi-objective machine learning, Springer, Berlin, 2006. https://doi.org/10.1007/11399346
|
[129]
|
K. Deb, Multi-objective optimisation using evolutionary algorithms: an introduction, in Multi-objective Evolutionary Optimisation for Product Design and Manufacturing, Springer, London, (2011), 3–34. https://doi.org/10.1007/978-0-85729-652-8_1
|
[130]
|
M. Parsa, A. Ankit, A. Ziabari, K. Roy, PABO: pseudo agent-based multi-objective bayesian hyperparameter optimization for efficient neural accelerator design, in 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), (2019), 1–8. https://doi.org/10.1109/iccad45719.2019.8942046
|
[131]
|
R. Schmucker, M. Donini, V. Perrone, C. Archambeau, Multi-objective multi-fidelity hyperparameter optimization with application to fairness, in NeurIPS 2020 Workshop on Meta-learning, 2020.
|
[132]
|
K. Miettinen, Nonlinear multiobjective optimization, Springer Science & Business Media, New York, 1999.
|
[133]
|
K. Miettinen, M. M. Mäkelä, On scalarizing functions in multiobjective optimization, OR Spectrum, 24 (2002), 193–213. https://doi.org/10.1007/s00291-001-0092-9 doi: 10.1007/s00291-001-0092-9
|
[134]
|
T. Chugh, Scalarizing functions in Bayesian multiobjective optimization, in 2020 IEEE Congress on Evolutionary Computation (CEC), (2020), 1–8. https://doi.org/10.1109/CEC48606.2020.9185706
|
[135]
|
Y. Y. Haimes, L. S. Lasdon, D. A. Wismer, On a bicriterion formulation of the problems of integrated system identification and system optimization, IEEE Trans. Syst. Man Cybern., (1971), 296–297. https://doi.org/10.1109/tsmc.1971.4308298 doi: 10.1109/tsmc.1971.4308298
|
[136]
|
K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE T. Evolut. Comput., 6 (2002), 182–197. https://doi.org/10.1109/4235.996017 doi: 10.1109/4235.996017
|
[137]
|
N. Srinivas, K. Deb, Muiltiobjective optimization using nondominated sorting in genetic algorithms, Evol. Comput., 2 (1994), 221–248. https://doi.org/10.1162/evco.1994.2.3.221 doi: 10.1162/evco.1994.2.3.221
|
[138]
|
K. Deb, H. Jain, An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints, IEEE T. Evolut. Comput., 18 (2014), 577–601. https://doi.org/10.1109/tevc.2013.2281535 doi: 10.1109/tevc.2013.2281535
|
[139]
|
K. Deb, H. Jain, An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part II: handling constraints and extending to an adaptive approach, IEEE T. Evolut. Comput., 18 (2014), 602–622. https://doi.org/10.1109/tevc.2013.2281534 doi: 10.1109/tevc.2013.2281534
|
[140]
|
E. Zitzler, M. Laumanns, L. Thiele, SPEA2: Improving the strength Pareto evolutionary algorithm, TIK Report, 103 (2001), 1–21. https://doi.org/10.3929/ethz-a-004284029 doi: 10.3929/ethz-a-004284029
|
[141]
|
Q. Zhang, H. Li, MOEA/D: a multiobjective evolutionary algorithm based on decomposition, IEEE T. Evolut. Comput., 11 (2007), 712–731. https://doi.org/10.1109/tevc.2007.892759 doi: 10.1109/tevc.2007.892759
|
[142]
|
E. Zitzler, L. Thiele, Multiobjective optimization using evolutionary algorithms — A comparative case study, in Parallel Problem Solving from Nature—PPSN V, Springer, (1998), 292–301. https://doi.org/10.1007/BFb0056872
|
[143]
|
M. Emmerich, N. Beume, B. Naujoks, An EMO algorithm using the hypervolume measure as selection criterion, in Evolutionary Multi-Criterion Optimization, Springer, (2005), 62–76. https://doi.org/10.1007/978-3-540-31880-4_5
|
[144]
|
X. Li, A non-dominated sorting particle swarm optimizer for multiobjective optimization, in Genetic and Evolutionary Computation—GECCO 2003, Springer, (2003), 37–48. https://doi.org/10.1007/3-540-45105-6_4
|
[145]
|
C. A. C. Coello, G. T. Pulido, M. S. Lechuga, Handling multiple objectives with particle swarm optimization, IEEE T. Evolut. Comput., 8 (2004), 256–279. https://doi.org/10.1109/tevc.2004.826067 doi: 10.1109/tevc.2004.826067
|
[146]
|
J. Knowles, ParEGO: a hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems, IEEE T. Evolut. Comput., 10 (2006), 50–66. https://doi.org/10.1109/tevc.2005.851274 doi: 10.1109/tevc.2005.851274
|
[147]
|
W. Ponweiser, T. Wagner, D. Biermann, M. Vincze, Multiobjective optimization on a limited budget of evaluations using model-assisted S-metric selection, in Parallel Problem Solving from Nature – PPSN X, Springer, (2008), 784–794. https://doi.org/10.1007/978-3-540-87700-4_78
|
[148]
|
M. T. M. Emmerich, K. C. Giannakoglou, B. Naujoks, Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels, IEEE T. Evolut. Comput., 10 (2006), 421–439. https://doi.org/10.1109/tevc.2005.859463 doi: 10.1109/tevc.2005.859463
|
[149]
|
Y. Jin, Surrogate-assisted evolutionary computation: Recent advances and future challenges, Swarm Evol. Comput., 1 (2011), 61–70. https://doi.org/10.1016/j.swevo.2011.05.001 doi: 10.1016/j.swevo.2011.05.001
|
[150]
|
S. Daulton, M. Balandat, E. Bakshy, Differentiable expected hypervolume improvement for parallel multi-objective Bayesian optimization, Adv. Neural Inform. Process. Syst., 33 (2020), 9851–9864.
|
[151]
|
D. Hernández-Lobato, J. Hernández-Lobato, A. Shah, R. Adams, Predictive entropy search for multi-objective bayesian optimization, in Proceedings of The 33rd International Conference on Machine Learning, (2016), 1492–1501.
|
[152]
|
S. Belakaria, A. Deshwal, J. R. Doppa, Max-value entropy search for multi-objective bayesian optimization, Adv. Neural Inform. Process. Syst., 32 (2019).
|
[153]
|
S. Daulton, M. Balandat, E. Bakshy, Parallel bayesian optimization of multiple noisy objectives with expected hypervolume improvement, Adv. Neural Inform. Process. Syst., 34 (2021), 2187–2200.
|
[154]
|
Z. J. Lin, R. Astudillo, P. Frazier, E. Bakshy, Preference exploration for efficient bayesian optimization with multiple outcomes, in Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, (2022), 4235–4258.
|
[155]
|
G. Misitano, B. Afsar, G. Lárraga, K. Miettinen, Towards explainable interactive multiobjective optimization: R-XIMO, Auton. Agent. Multi-Agent Syst., 36 (2022), 43. http://doi.org/10.1007/s10458-022-09577-3 doi: 10.1007/s10458-022-09577-3
|
[156]
|
G. Malkomes, B. Cheng, E. H. Lee, M. Mccourt, Beyond the Pareto efficient frontier: constraint active search for multiobjective experimental design, in Proceedings of the 38th International Conference on Machine Learning, (2021), 7423–7434.
|
[157]
|
Z. Chen, Y. Zhou, Z. Huang, X. Xia, Towards efficient multiobjective hyperparameter optimization: a multiobjective multi-fidelity bayesian optimization and hyperband algorithm, in Parallel Problem Solving from Nature–PPSN XVII, Springer, (2022), 160–174. http://doi.org/10.1007/978-3-031-14714-2_12
|
[158]
|
A. Dushatskiy, A. Chebykin, T. Alderliesten, P. A. N. Bosman, Multi-objective population based training, in Proceedings of the 40th International Conference on Machine Learning, (2023), 8969–8989.
|
[159]
|
R. Schmucker, M. Donini, M. B. Zafar, D. Salinas, C. Archambeau, Multi-objective asynchronous successive halving, preprint, arXiv: 2106.12639.
|
[160]
|
F. Hutter, L. Kotthoff, J. Vanschoren, Automated Machine Learning: Methods, Systems, Challenges, Springer, Cham, 2019. https://doi.org/10.1007/978-3-030-05318-5
|
[161]
|
T. Akiba, S. Sano, T. Yanase, T. Ohta, M. Koyama, Optuna: a next-generation hyperparameter optimization framework, in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (2019), 2623–2631. https://doi.org/10.1145/3292500.3330701
|
[162]
|
R. Liaw, E. Liang, R. Nishihara, P. Moritz, J. E. Gonzalez, I. Stoica, Tune: a research platform for distributed model selection and training, preprint, arXiv: 1807.05118.
|
[163]
|
M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, et al., BoTorch: a framework for efficient monte-carlo bayesian optimization, Adv. Neural Inform. Process. Syst., 33 (2020).
|
[164]
|
J. Bergstra, B. Komer, C. Eliasmith, D. Yamins, D. D. Cox, Hyperopt: a python library for model selection and hyperparameter optimization, Comput. Sci. Discov., 8 (2015), 014008. https://doi.org/10.1088/1749-4699/8/1/014008 doi: 10.1088/1749-4699/8/1/014008
|
[165]
|
M. Lindauer, K. Eggensperger, M. Feurer, A. Biedenkapp, D. Deng, C. Benjamins, et al., SMAC3: A versatile bayesian optimization package for hyperparameter optimization, J. Mach. Learn. Res., 23 (2022), 1–9.
|
[166]
|
F. A. Fortin, F. M. De Rainville, M. A. G. Gardner, M. Parizeau, C. Gagné, DEAP: Evolutionary algorithms made easy, J. Mach. Learn. Res., 13 (2012), 2171–2175.
|
[167]
|
R. Martinez-Cantin, Bayesopt: a bayesian optimization library for nonlinear optimization, experimental design and bandits, J. Mach. Learn. Res., 15 (2014), 3735–3739.
|
[168]
|
L. Nardi, A. Souza, D. Koeplinger, K. Olukotun, HyperMapper: a practical design space exploration framework, in 2019 IEEE 27th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), (2019), 425–426. https://doi.org/10.1109/MASCOTS.2019.00053
|
[169]
|
M. Lang, M. Binder, J. Richter, P. Schratz, F. Pfisterer, S. Coors, et al., mlr3: A modern object-oriented machine learning framework in R, J. Open Source Softw., 4 (2019), 1903. https://doi.org/10.21105/joss.01903 doi: 10.21105/joss.01903
|
[170]
|
B. Bischl, R. Sonabend, L. Kotthoff, M. Lang, Applied Machine Learning Using mlr3 in R, Chapman and Hall/CRC, New York, 2023. https://doi.org/10.1201/9781003402848
|
[171]
|
A. Benítez-Hidalgo, A. J. Nebro, J. García-Nieto, I. Oregi, J. Del Ser, jMetalPy: A Python framework for multi-objective optimization with metaheuristics, Swarm Evol. Comput., 51 (2019), 100598. https://doi.org/10.1016/j.swevo.2019.100598 doi: 10.1016/j.swevo.2019.100598
|
[172]
|
N. E. Toklu, T. Atkinson, V. Micka, P. Liskowski, R. K. Srivastava, EvoTorch: Scalable Evolutionary Computation in Python, preprint, arXiv: 2302.12600.
|
[173]
|
Y. Li, Y. Shen, W. Zhang, Y. Chen, H. Jiang, M. Liu, et al., OpenBox: a generalized black-box optimization service, in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, (2021), 3209–3219. https://doi.org/10.1145/3447548.3467061
|
[174]
|
K. Kandasamy, K. R. Vysyaraju, W. Neiswanger, B. Paria, C. R. Collins, J. Schneider, et al., Tuning hyperparameters without grad students: Scalable and robust bayesian optimisation with dragonfly, J. Mach. Learn. Res., 21 (2020), 1–27.
|
[175]
|
D. Salinas, M. Seeger, A. Klein, V. Perrone, M. Wistuba, C. Archambeau, Syne Tune: a library for large scale hyperparameter tuning and reproducible research, in Proceedings of the First International Conference on Automated Machine Learning, (2022), 1–23.
|
[176]
|
J. George, C. Gao, R. Liu, H. G. Liu, Y. Tang, R. Pydipaty, et al., A scalable and cloud-native hyperparameter tuning system, preprint, arXiv: 2006.02085.
|
[177]
|
O. Taubert, M. Weiel, D. Coquelin, A. Farshian, C. Debus, A. Schug, et al., Massively parallel genetic optimization through asynchronous propagation of populations, in High Performance Computing, Springer, (2023), 106–124. https://doi.org/10.1007/978-3-031-32041-5_6
|
[178]
|
J. Blank, K. Deb, Pymoo: multi-objective optimization in Python, IEEE Access, 8 (2020), 89497–89509. http://doi.org/10.1109/access.2020.2990567 doi: 10.1109/access.2020.2990567
|
[179]
|
S. S. Sandha, M. Aggarwal, I. Fedorov, M. Srivastava, Mango: A Python Library for Parallel Hyperparameter Tuning, in ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2020), 3987–3991. https://doi.org/10.1109/icassp40776.2020.9054609
|
[180]
|
L. Hertel, J. Collado, P. Sadowski, J. Ott, P. Baldi, Sherpa: Robust hyperparameter optimization for machine learning, SoftwareX, 12 (2020), 100591. https://doi.org/10.1016/j.softx.2020.100591 doi: 10.1016/j.softx.2020.100591
|
[181]
|
N. O. Nikitin, P. Vychuzhanin, M. Sarafanov, I. S. Polonskaia, I. Revin, I. V. Barabanova, et al., Automated evolutionary approach for the design of composite machine learning pipelines, Future Gener. Comput. Syst., 127 (2022), 109–125. https://doi.org/10.1016/j.future.2021.08.022 doi: 10.1016/j.future.2021.08.022
|
[182]
|
I. S. Polonskaia, N. O. Nikitin, I. Revin, P. Vychuzhanin, A. V. Kalyuzhnaya, Multi-objective evolutionary design of composite data-driven model, in 2021 IEEE Congress on Evolutionary Computation (CEC), (2021), 926–933. https://doi.org/10.1109/CEC45853.2021.9504773
|
[183]
|
R. S. Olson, J. H. Moore, TPOT: A tree-based pipeline optimization tool for automating machine learning, in Proceedings of the Workshop on Automatic Machine Learning, (2016), 66–74.
|
[184]
|
C. Guan, Z. Zhang, H. Li, H. Chang, Z. Zhang, Y. Qin, et al., AutoGL: A library for automated graph learning, in ICLR 2021 Workshop on Geometrical and Topological Representation Learning, 2021.
|
[185]
|
M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, F. Hutter, Efficient and robust automated machine learning, Adv. Neural Inform. Process. Syst., 28 (2015).
|
[186]
|
M. Feurer, K. Eggensperger, S. Falkner, M. Lindauer, F. Hutter, Auto-sklearn 2.0: Hands-free automl via meta-learning, J. Mach. Learn. Res., 23 (2022), 1–61.
|
[187]
|
L. Zimmer, M. Lindauer, F. Hutter, Auto-PyTorch: Multi-fidelity meta-learning for efficient and robust AutoDL, IEEE T. Pattern Anal., 43 (2021), 3079–3090. https://doi.org/10.1109/tpami.2021.3067763 doi: 10.1109/tpami.2021.3067763
|
[188]
|
H. Jin, Q. Song, X. Hu, Auto-Keras: an efficient neural architecture search system, in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (2019), 1946–1956. https://doi.org/10.1145/3292500.3330648
|
[189]
|
H. Jin, F. Chollet, Q. Song, X. Hu, AutoKeras: An autoML library for deep learning, J. Mach. Learn. Res., 24 (2023), 1–6.
|
[190]
|
N. Erickson, J. Mueller, A. Shirkov, H. Zhang, P. Larroy, M. Li, et al., AutoGluon-Tabular: robust and accurate autoML for structured data, preprint, arXiv: 2003.06505.
|
[191]
|
C. Wang, Q. Wu, M. Weimer, E. Zhu, FLAML: a fast and lightweight autoML library, in Proceedings of Machine Learning and Systems 3 (MLSys 2021), 2021.
|
[192]
|
N. Fusi, R. Sheth, M. Elibol, Probabilistic matrix factorization for automated machine learning, Adv. Neural Inform. Process. Syst., 31 (2018), 3166–3180.
|
[193]
|
A. Yakovlev, H. F. Moghadam, A. Moharrer, J. Cai, N. Chavoshi, V. Varadarajan, et al., Oracle AutoML: a fast and predictive AutoML pipeline, Proc. VLDB Endow., 13 (2020), 3166–3180. https://doi.org/10.14778/3415478.3415542
|
[194]
|
D. Golovin, B. Solnik, S. Moitra, G. Kochanski, J. Karro, D. Sculley, Google vizier: A service for black-box optimization, in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (2017), 1487–1495. https://doi.org/10.1145/3097983.3098043
|
[195]
|
E. Liberty, Z. Karnin, B. Xiang, L. Rouesnel, B. Coskun, R. Nallapati, et al., Elastic machine learning algorithms in Amazon SageMaker, in Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, (2020), 731–737. https://doi.org/10.1145/3318464.3386126
|
[196]
|
S. Blume, T. Benedens, D. Schramm, Hyperparameter optimization techniques for designing software sensors based on artificial neural networks, Sensors, 21 (2021), 8435. https://doi.org/10.3390/s21248435 doi: 10.3390/s21248435
|
[197]
|
C. Cooney, A. Korik, R. Folli, D. Coyle, Evaluation of hyperparameter optimization in machine and deep learning methods for decoding imagined speech EEG, Sensors, 20 (2020), 4629. https://doi.org/10.3390/s20164629 doi: 10.3390/s20164629
|
[198]
|
R. Khalid, N. Javaid, Survey on hyperparameters optimization algorithms of forecasting models in smart grid, Sustain. Cities Soc., 61 (2020), 102275. https://doi.org/10.1016/j.scs.2020.102275 doi: 10.1016/j.scs.2020.102275
|
[199]
|
R. Andonie, Hyperparameter optimization in learning systems, J. Membr. Comput., 1 (2019), 279–291. https://doi.org/10.1007/s41965-019-00023-0 doi: 10.1007/s41965-019-00023-0
|
[200]
|
G. Luo, A review of automatic selection methods for machine learning algorithms and hyper-parameter values, Netw. Model. Anal. Health Inform. Bioinform., 5 (2016), 1–16. https://doi.org/10.1007/s13721-016-0125-6 doi: 10.1007/s13721-016-0125-6
|
[201]
|
S. Stober, D. J. Cameron, J. A. Grahn, Using convolutional neural networks to recognize rhythm stimuli from electroencephalography recordings, Adv. Neural Inform. Process. Syst., 27 (2014), 1449–1457.
|
[202]
|
A. Drouin-Picaro, T. H. Falk, Using deep neural networks for natural saccade classification from electroencephalograms, in 2016 IEEE EMBS International Student Conference (ISC), (2016), 1–4. https://doi.org/10.1109/embsisc.2016.7508606
|
[203]
|
Z. Zhou, F. Xiong, B. Huang, C. Xu, R. Jiao, B. Liao, et al., Game-theoretical energy management for energy internet with big data-based renewable power forecasting, IEEE Access, 5 (2017), 5731–5746. https://doi.org/10.1109/access.2017.2658952 doi: 10.1109/access.2017.2658952
|
[204]
|
J. Waring, C. Lindvall, R. Umeton, Automated machine learning: Review of the state-of-the-art and opportunities for healthcare, Artif. Intell. Med., 104 (2020), 101822. https://doi.org/10.1016/j.artmed.2020.101822 doi: 10.1016/j.artmed.2020.101822
|
[205]
|
A. Alaa, M. van der Schaar, Autoprognosis: Automated clinical prognostic modeling via bayesian optimization with structured kernel learning, in Proceedings of the 35th International Conference on Machine Learning, (2018), 139–148.
|
[206]
|
I. Castiglioni, L. Rundo, M. Codari, G. Di Leo, C. Salvatore, M. Interlenghi, et al., AI applications to medical images: From machine learning to deep learning, Phys. Med., 83 (2021), 9–24. https://doi.org/10.1016/j.ejmp.2021.02.006 doi: 10.1016/j.ejmp.2021.02.006
|
[207]
|
M. Nishio, K. Fujimoto, K. Togashi, Lung segmentation on chest X-ray images in patients with severe abnormal findings using deep learning, Int. J. Imag. Syst. Tech., 31 (2021), 1002–1008. https://doi.org/10.1002/ima.22528 doi: 10.1002/ima.22528
|
[208]
|
A. Abdellatif, H. Abdellatef, J. Kanesan, C. O. Chow, J. H. Chuah, H. M. Gheni, An Effective Heart Disease Detection and Severity Level Classification Model Using Machine Learning and Hyperparameter Optimization Methods, IEEE Access, 10 (2022), 79974–79985. https://doi.org/10.1109/ACCESS.2022.3191669 doi: 10.1109/ACCESS.2022.3191669
|
[209]
|
D. M. Belete, M. D. Huchaiah, Grid search in hyperparameter optimization of ML models for prediction of HIV/AIDS test results, Int. J. Comput. Appl., 44 (2022), 875–886. https://doi.org/10.1080/1206212X.2021.1974663 doi: 10.1080/1206212X.2021.1974663
|
[210]
|
S. Nematzadeh, F. Kiani, M. Torkamanian-Afshar, N. Aydin, Tuning hyperparameters of machine learning algorithms and deep neural networks using metaheuristics: A bioinformatics study on biomedical and biological cases, Comput. Biol. Chem., 97 (2022), 107619. https://doi.org/10.1016/j.compbiolchem.2021.107619 doi: 10.1016/j.compbiolchem.2021.107619
|
[211]
|
G. I. Diaz, A. Fokoue-Nkoutche, G. Nannicini, H. Samulowitz, An effective algorithm for hyperparameter optimization of neural networks, IBM J. Res. Dev., 61 (2017), 9:1–9:11. https://doi.org/10.1147/JRD.2017.2709578 doi: 10.1147/JRD.2017.2709578
|
[212]
|
D. Stamoulis, E. Cai, D. C. Juan, D. Marculescu, HyperPower: Power- and memory-constrained hyper-parameter optimization for neural networks, in 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), (2018), 19–24. https://doi.org/10.23919/DATE.2018.8341973
|
[213]
|
Z. Lu, L. Chen, C. K. Chiang, F. Sha, Hyper-parameter tuning under a budget constraint, in Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, (2019), 5744–5750. https://doi.org/10.24963/ijcai.2019/796
|
[214]
|
C. Wang, H. Wang, C. Zhou, H. Chen, ExperienceThinking: Constrained hyperparameter optimization based on knowledge and pruning, Knowl.-Based Syst., 223 (2021), 106602. https://doi.org/10.1016/j.knosys.2020.106602 doi: 10.1016/j.knosys.2020.106602
|
[215]
|
B. Letham, B. Karrer, G. Ottoni, E. Bakshy, Constrained Bayesian Optimization with Noisy Experiments, Bayesian Anal., 14 (2019), 495–519. https://doi.org/10.1214/18-BA1110 doi: 10.1214/18-BA1110
|
[216]
|
T. P. Papalexopoulos, C. Tjandraatmadja, R. Anderson, J. P. Vielma, D. Belanger, Constrained discrete black-box optimization using mixed-integer programming, in Proceedings of the 39th International Conference on Machine Learning, (2022), 17295–17322.
|
[217]
|
F. Berkenkamp, A. Krause, A. P. Schoellig, Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics, Mach. Learn., 112 (2023), 3713–3747. https://doi.org/10.1007/s10994-021-06019-1 doi: 10.1007/s10994-021-06019-1
|
[218]
|
F. Wenzel, J. Snoek, D. Tran, R. Jenatton, Hyperparameter ensembles for robustness and uncertainty quantification, Adv. Neural Inform. Process. Syst., 33 (2020), 6514–6527.
|
[219]
|
F. Seifi, M. J. Azizi, S. T. A. Niaki, A data-driven robust optimization algorithm for black-box cases: An application to hyper-parameter optimization of machine learning algorithms, Comput. Ind. Eng., 160 (2021), 107581. https://doi.org/10.1016/j.cie.2021.107581 doi: 10.1016/j.cie.2021.107581
|
[220]
|
G. Sunder, T. A. Albrecht, C. J. Nachtsheim, Robust sequential experimental strategy for black-box optimization with application to hyperparameter tuning, Qual. Reliab. Eng. Int., 38 (2022), 3992–4014. https://doi.org/10.1002/qre.3181
|
[221]
|
L. Franceschi, P. Frasconi, S. Salzo, R. Grazzi, M. Pontil, Bilevel Programming for Hyperparameter Optimization and Meta-Learning, in Proceedings of the 35th International Conference on Machine Learning, (2018), 1568–1577.
|
[222]
|
O. Bohdal, Y. Yang, T. Hospedales, EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization, Adv. Neural Inform. Process. Syst., 34 (2021).
|
[223]
|
X. He, K. Zhao, X. Chu, AutoML: A survey of the state-of-the-art, Knowl.-Based Syst., 212 (2021), 106622. https://doi.org/10.1016/j.knosys.2020.106622 doi: 10.1016/j.knosys.2020.106622
|
[224]
|
A. M. Vincent, P. Jidesh, An improved hyperparameter optimization framework for AutoML systems using evolutionary algorithms, Sci. Rep., 13 (2023), 4737. https://doi.org/10.1038/s41598-023-32027-3 doi: 10.1038/s41598-023-32027-3
|
[225]
|
B. Zoph, V. Vasudevan, J. Shlens, Q. V. Le, Learning transferable architectures for scalable image recognition, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2018), 8697–8710. https://doi.org/10.1109/cvpr.2018.00907
|
[226]
|
E. Real, A. Aggarwal, Y. Huang, Q. V. Le, Regularized evolution for image classifier architecture search, in Proceedings of the AAAI Conference on Artificial Intelligence, (2019), 4780–4789. https://doi.org/10.1609/aaai.v33i01.33014780
|
[227]
|
C. White, W. Neiswanger, Y. Savani, BANANAS: Bayesian optimization with neural architectures for neural architecture search, in Proceedings of the AAAI Conference on Artificial Intelligence, (2021), 10293–10301. https://doi.org/10.1609/aaai.v35i12.17233
|
[228]
|
H. Liu, K. Simonyan, Y. Yang, DARTS: differentiable architecture search, in International Conference on Learning Representations, (2019).
|
[229]
|
X. Wang, C. Xue, J. Yan, X. Yang, Y. Hu, K. Sun, Mergenas: Merge operations into one for differentiable architecture search, in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, (2020), 3065–3072. https://doi.org/10.24963/ijcai.2020/424
|
[230]
|
X. Wang, W. Guo, J. Su, X. Yang, J. Yan, Zarts: On zero-order optimization for neural architecture search, Adv. Neural Inform. Process. Syst., 35 (2022).
|
[231]
|
Y. Chen, T. Yang, X. Zhang, G. Meng, X. Xiao, J. Sun, Detnas: Backbone search for object detection, Adv. Neural Inform. Process. Syst., 32 (2019), 6642–6652.
|
[232]
|
X. Wang, J. Lin, J. Zhao, X. Yang, J. Yan, Eautodet: Efficient architecture search for object detection, in Computer Vision–ECCV 2022, Springer, (2022), 668–684. https://doi.org/10.1007/978-3-031-20044-1_38
|
[233]
|
L. Chen, M. D. Collins, Y. Zhu, G. Papandreou, B. Zoph, F. Schroff, et al., Searching for efficient multi-scale architectures for dense image prediction, Adv. Neural Inform. Process. Syst., 31 (2018).
|
[234]
|
C. Liu, L. Chen, F. Schroff, H. Adam, W. Hua, A. L. Yuille, et al., Auto-deeplab: Hierarchical neural architecture search for semantic image segmentation, in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2019), 82–92. https://doi.org/10.1109/cvpr.2019.00017
|
[235]
|
X. Wang, Z. Lian, J. Lin, C. Xue, J. Yan, DIY your easynas for vision: Convolution operation merging, map channel reducing, and search space to supernet conversion tooling, IEEE T. Pattern Anal., 45 (2023), 13974–13990. https://doi.org/10.1109/tpami.2023.3298296 doi: 10.1109/tpami.2023.3298296
|
[236]
|
Y. Bengio, A. Lodi, A. Prouvost, Machine learning for combinatorial optimization: a methodological tour d'horizon, Eur. J. Oper. Res., 290 (2021), 405–421. https://doi.org/10.1016/j.ejor.2020.07.063 doi: 10.1016/j.ejor.2020.07.063
|
[237]
|
J. Yan, S. Yang, E. Hancock, Learning for graph matching and related combinatorial optimization problems, in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, (2020), 4988–4996. https://doi.org/10.24963/ijcai.2020/694
|
[238]
|
E. B. Khalil, H. Dai, Y. Zhang, B. Dilkina, L. Song, Learning combinatorial optimization algorithms over graphs, Adv. Neural Inform. Process. Syst., 30 (2017), 6351–6361.
|
[239]
|
M. Nazari, A. Oroojlooy, L. Snyder, M. Takác, Reinforcement learning for solving the vehicle routing problem, Adv. Neural Inform. Process. Syst., 31 (2018), 9839–9849.
|
[240]
|
C. Liu, Z. Dong, H. Ma, W. Luo, X. Li, B. Pang, et al., L2P-MIP: Learning to presolve for mixed integer programming, in The Twelfth International Conference on Learning Representations, (2024).
|
[241]
|
Y. Li, X. Chen, W. Guo, X. Li, W. Luo, J. Huang, et al., Hardsatgen: Understanding the difficulty of hard sat formula generation and a strong structure-hardness-aware baseline, in Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, (2023), 4414–4425. https://doi.org/10.1145/3580305.3599837
|
[242]
|
R. Wang, L. Shen, Y. Chen, X. Yang, D. Tao, J. Yan, Towards one-shot neural combinatorial solvers: Theoretical and empirical notes on the cardinality-constrained case, in The Eleventh International Conference on Learning Representations, (2022).
|
[243]
|
Q. Ren, Q. Bao, R. Wang, J. Yan, Appearance and structure aware robust deep visual graph matching: Attack, defense and beyond, in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2022), 15242–15251. https://doi.org/10.1109/cvpr52688.2022.01483
|
[244]
|
J. Yan, M. Cho, H. Zha, X. Yang, S. M. Chu, Multi-graph matching via affinity optimization with graduated consistency regularization, IEEE T. Pattern Anal., 38 (2016), 1228–1242. https://doi.org/10.1109/tpami.2015.2477832 doi: 10.1109/tpami.2015.2477832
|
[245]
|
T. Wang, Z. Jiang, J. Yan, Multiple graph matching and clustering via decayed pairwise matching composition, in Proceedings of the AAAI Conference on Artificial Intelligence, (2020), 1660–1667. https://doi.org/10.1609/aaai.v34i02.5528
|
[246]
|
R. Wang, T. Zhang, T. Yu, J. Yan, X. Yang, Combinatorial learning of graph edit distance via dynamic embedding, in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021), 5237–5246. https://doi.org/10.1109/CVPR46437.2021.00520
|