
Hyperparameter (machine learning)

In machine learning, a hyperparameter is a parameter that can be set in order to define any configurable part of a model's learning process. Hyperparameters can be classified as either model hyperparameters (such as the topology and size of a neural network) or algorithm hyperparameters (such as the learning rate and the batch size of an optimizer). These are named hyperparameters in contrast to parameters, which are characteristics that the model learns from the data.

Hyperparameters are not required by every model or algorithm. Some simple algorithms such as ordinary least squares regression require none. However, the LASSO algorithm, for example, adds a regularization hyperparameter to ordinary least squares which must be set before training.[1] Even models and algorithms without a strict requirement to define hyperparameters may not produce meaningful results if these are not carefully chosen. However, optimal values for hyperparameters are not always easy to predict. Some hyperparameters may have no meaningful effect, or one important variable may be conditional upon the value of another. Often a separate process of hyperparameter tuning is needed to find a suitable combination for the data and task.
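The LASSO case can be made concrete with a minimal pure-Python sketch (an illustration, not a library implementation). In the single-feature, no-intercept case, the LASSO coefficient has a closed form: the ordinary least squares solution with its numerator soft-thresholded by the regularization strength `alpha`, which must be fixed before fitting:

```python
# Minimal sketch: one-feature least squares vs. LASSO, showing that the
# regularization strength `alpha` is a hyperparameter set before training.

def ols_coef(x, y):
    # Ordinary least squares slope for a no-intercept, single-feature model.
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def lasso_coef(x, y, alpha):
    # Closed-form 1-D LASSO solution: soft-threshold the OLS numerator.
    # Objective: (1/2n) * sum((y_i - w*x_i)^2) + alpha * |w|
    n = len(x)
    xy = sum(xi * yi for xi, yi in zip(x, y))
    xx = sum(xi * xi for xi in x)
    shrunk = max(abs(xy) - n * alpha, 0.0)
    return (1.0 if xy >= 0 else -1.0) * shrunk / xx

x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]               # roughly y = 2x

w_ols = ols_coef(x, y)
w_l1 = lasso_coef(x, y, alpha=0.5)     # alpha chosen before fitting
print(w_ols, w_l1)                     # the LASSO coefficient is shrunk toward zero
```

Setting `alpha = 0` recovers the ordinary least squares solution; larger values shrink the coefficient further, eventually to exactly zero.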

As well as improving model performance, hyperparameters can be used by researchers to introduce robustness and reproducibility into their work, especially when it uses models that incorporate random number generation.

Considerations

The time required to train and test a model can depend upon the choice of its hyperparameters.[2] A hyperparameter is usually of continuous or integer type, leading to mixed-type optimization problems.[2] The existence of some hyperparameters is conditional upon the value of others, e.g. the size of each hidden layer in a neural network can be conditional upon the number of layers.[2]
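A conditional search space of this kind can be sketched as follows; the hyperparameter names and value ranges are illustrative, not drawn from any particular library:

```python
# Sketch of a conditional hyperparameter space: the per-layer size
# hyperparameters only exist once the number of layers has been chosen.
import random

def sample_config(rng):
    n_layers = rng.randint(1, 3)                  # integer hyperparameter
    layer_sizes = [rng.choice([32, 64, 128])      # conditional: one size
                   for _ in range(n_layers)]      # hyperparameter per layer
    learning_rate = 10 ** rng.uniform(-4, -1)     # continuous hyperparameter
    return {"n_layers": n_layers,
            "layer_sizes": layer_sizes,
            "learning_rate": learning_rate}

rng = random.Random(0)
cfg = sample_config(rng)
print(cfg)   # mixed integer, categorical, and continuous types in one space
```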

Difficult-to-learn parameters

The objective function is typically non-differentiable with respect to hyperparameters. As a result, in most instances hyperparameters cannot be learned using the gradient-based optimization methods (such as gradient descent) that are commonly employed to learn model parameters. These hyperparameters describe aspects of the model's representation that cannot be learned by common optimization methods but nonetheless affect the loss function. An example is the tolerance hyperparameter for errors in support vector machines.

Untrainable parameters

Sometimes, hyperparameters cannot be learned from the training data because they aggressively increase the capacity of a model and can push the loss function to an undesired minimum (overfitting the data), rather than correctly mapping the richness of the structure in the data. For example, if the degree of the polynomial in a polynomial regression model were treated as a trainable parameter, training would increase the degree until the model fit the data perfectly, yielding low training error but poor generalization performance.
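The polynomial example can be sketched in a few lines of pure Python with illustrative data: five roughly linear points, fitted first by a least-squares line and then by the degree-4 interpolating polynomial (via Lagrange interpolation), whose training error is driven to zero:

```python
# Sketch: if the polynomial degree could be "trained", it would grow until
# training error vanished. Five noisy points: a line leaves residual error,
# while the degree-4 interpolating polynomial fits them exactly.

def lagrange_predict(xs, ys, x):
    # Evaluate the degree len(xs)-1 interpolating polynomial at x.
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def line_fit(xs, ys):
    # Least-squares line y = a + b*x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.2, 1.9, 3.2, 3.8]          # roughly linear, with noise

a, b = line_fit(xs, ys)
mse_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
mse_poly = (sum((y - lagrange_predict(xs, ys, x)) ** 2
                for x, y in zip(xs, ys)) / len(xs))
print(mse_line, mse_poly)   # line: small but nonzero; degree 4: ~0
```

The degree-4 fit's zero training error says nothing about its error on new points, which is precisely why the degree is held out as a hyperparameter rather than trained.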

Tunability

Most performance variation can be attributed to just a few hyperparameters.[3][2][4] The tunability of an algorithm, hyperparameter, or interacting hyperparameters is a measure of how much performance can be gained by tuning it.[5] For an LSTM, the learning rate followed by the network size are the most crucial hyperparameters,[6] whereas batching and momentum have no significant effect on its performance.[7]

Although some research has advocated the use of mini-batch sizes in the thousands, other work has found the best performance with mini-batch sizes between 2 and 32.[8]

Robustness

An inherent stochasticity in learning directly implies that the empirical hyperparameter performance is not necessarily its true performance.[2] Methods that are not robust to simple changes in hyperparameters, random seeds, or even different implementations of the same algorithm cannot be integrated into mission critical control systems without significant simplification and robustification.[9]

Reinforcement learning algorithms, in particular, require measuring their performance over a large number of random seeds, and also measuring their sensitivity to choices of hyperparameters.[9] Their evaluation with a small number of random seeds does not capture performance adequately due to high variance.[9] Some reinforcement learning methods, e.g. DDPG (Deep Deterministic Policy Gradient), are more sensitive to hyperparameter choices than others.[9]
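Evaluation over many seeds can be sketched as follows; `noisy_score` is a hypothetical stand-in for one full training run of a stochastic algorithm such as an RL agent:

```python
# Sketch: evaluating a stochastic algorithm over many random seeds, since a
# single run's score can be misleading when run-to-run variance is high.
import random
import statistics

def noisy_score(seed):
    # Stand-in for "train and evaluate once with this seed".
    rng = random.Random(seed)
    return 100 + rng.gauss(0, 15)       # high variance across seeds

scores = [noisy_score(s) for s in range(30)]
mean = statistics.mean(scores)
stdev = statistics.stdev(scores)
print(f"{mean:.1f} +/- {stdev:.1f} over {len(scores)} seeds")
```

Reporting the spread alongside the mean makes clear how much of any apparent improvement could be explained by seed luck alone.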

Optimization

Hyperparameter optimization finds a tuple of hyperparameters that yields an optimal model which minimizes a predefined loss function on given test data.[2] The objective function takes a tuple of hyperparameters and returns the associated loss.[2] Typically these methods are not gradient based, and instead apply concepts from derivative-free optimization or black box optimization.
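Random search is one of the simplest such derivative-free methods and can be sketched as follows; `validation_loss` here is a hypothetical smooth stand-in for the expensive black box "train a model with these hyperparameters and return its validation loss":

```python
# Sketch of derivative-free (black-box) hyperparameter optimization via
# random search: sample configurations, evaluate each, keep the best.
import math
import random

def validation_loss(learning_rate, n_units):
    # Hypothetical stand-in objective, lowest near lr=0.01 and n_units=64.
    return (math.log10(learning_rate) + 2) ** 2 + ((n_units - 64) / 64) ** 2

def random_search(objective, n_trials, seed=0):
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        cfg = {"learning_rate": 10 ** rng.uniform(-5, 0),   # log-uniform
               "n_units": rng.choice([16, 32, 64, 128, 256])}
        loss = objective(**cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

best_cfg, best_loss = random_search(validation_loss, n_trials=200)
print(best_cfg, best_loss)
```

No gradients of the objective are used at any point, which is what allows the same loop to wrap any training procedure, differentiable or not.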

Reproducibility

Apart from tuning hyperparameters, machine learning involves storing and organizing the parameters and results, and making sure they are reproducible.[10] In the absence of a robust infrastructure for this purpose, research code often evolves quickly and compromises essential aspects like bookkeeping and reproducibility.[11] Online collaboration platforms for machine learning go further by allowing scientists to automatically share, organize and discuss experiments, data, and algorithms.[12] Reproducibility can be particularly difficult for deep learning models.[13] For example, research has shown that deep learning models depend very heavily even on the random seed selection of the random number generator.[14]
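The seed dependence can be illustrated with a minimal sketch, in which pinning a single seed makes a stochastic "experiment" exactly repeatable; the computation is a stand-in for any seed-dependent step such as weight initialization:

```python
# Sketch: pinning the random seed makes a stochastic experiment repeatable,
# while a different seed produces a different result.
import random

def run_experiment(seed):
    rng = random.Random(seed)           # all randomness flows from one seed
    weights = [rng.gauss(0, 1) for _ in range(4)]   # e.g. initialization
    return sum(w * w for w in weights)              # a stand-in "result"

a = run_experiment(seed=42)
b = run_experiment(seed=42)
c = run_experiment(seed=7)
print(a == b, a == c)   # same seed reproduces; different seed diverges
```

Real training pipelines typically have several independent sources of randomness (initialization, data shuffling, dropout), each of which must be seeded for a run to be fully reproducible.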

References

  1. ^ Yang, Li; Shami, Abdallah (2020). "On hyperparameter optimization of machine learning algorithms: Theory and practice". Neurocomputing. 415: 295–316. arXiv:2007.15745. doi:10.1016/j.neucom.2020.07.061. ISSN 0925-2312. S2CID 220919678.
  2. ^ a b c d e f g Claesen, Marc; De Moor, Bart (2015). "Hyperparameter Search in Machine Learning". arXiv:1502.02127.
  3. ^ Leyton-Brown, Kevin; Hoos, Holger; Hutter, Frank (2014). "An Efficient Approach for Assessing Hyperparameter Importance". Proceedings of Machine Learning Research: 754–762 – via proceedings.mlr.press.
  4. ^ van Rijn, Jan N.; Hutter, Frank (2017). "Hyperparameter Importance Across Datasets". arXiv:1710.04725.
  5. ^ Probst, Philipp; Bischl, Bernd; Boulesteix, Anne-Laure (2018). "Tunability: Importance of Hyperparameters of Machine Learning Algorithms". arXiv:1802.09596.
  6. ^ Greff, K.; Srivastava, R. K.; Koutník, J.; Steunebrink, B. R.; Schmidhuber, J. (2017). "LSTM: A Search Space Odyssey". IEEE Transactions on Neural Networks and Learning Systems. 28 (10): 2222–2232. arXiv:1503.04069. doi:10.1109/TNNLS.2016.2582924. PMID 27411231. S2CID 3356463.
  7. ^ Breuel, Thomas M. (2015). "Benchmarking of LSTM networks". arXiv:1508.02774.
  8. ^ Masters, Dominic; Luschi, Carlo (2018). "Revisiting Small Batch Training for Deep Neural Networks". arXiv:1804.07612.
  9. ^ a b c d Mania, Horia; Guy, Aurelia; Recht, Benjamin (2018). "Simple random search provides a competitive approach to reinforcement learning". arXiv:1803.07055.
  10. ^ Greff, Klaus; Schmidhuber, Jürgen (2015). "Introducing Sacred: A Tool to Facilitate Reproducible Research" (PDF).
  11. ^ Greff, Klaus; et al. (2017). "The Sacred Infrastructure for Computational Research" (PDF). Archived from the original (PDF) on 2020-09-29. Retrieved 2018-04-06.
  12. ^ Vanschoren, Joaquin; et al. (2014). "OpenML: networked science in machine learning". arXiv:1407.7722.
  13. ^ Villa, Jennifer; Zimmerman, Yoav (25 May 2018). "Reproducibility in ML: why it matters and how to achieve it". Determined AI Blog. Retrieved 31 August 2020.
  14. ^ Bethard, S. (2022). "We need to talk about random seeds". arXiv:2210.13393.
