Coupled pattern learner

Coupled Pattern Learner (CPL) is a machine learning algorithm that couples the semi-supervised learning of categories and relations to forestall the semantic drift associated with bootstrap learning methods.

Semi-supervised learning approaches that combine a small number of labeled examples with many unlabeled examples are often unreliable, because they tend to produce an internally consistent but incorrect set of extractions. CPL addresses this problem by simultaneously learning classifiers for many different categories and relations in the presence of an ontology that defines constraints coupling the training of these classifiers. It was introduced by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell in 2009.[1][2]

CPL overview

CPL is an approach to semi-supervised learning that yields more accurate results by coupling the training of many information extractors. The basic idea behind CPL is that semi-supervised training of a single type of extractor, such as ‘coach’, is much more difficult than simultaneously training many extractors that cover a variety of inter-related entity and relation types. Using prior knowledge about the relationships between these entities and relations, CPL turns unlabeled data into a useful constraint during training. For example, ‘coach(x)’ implies ‘person(x)’ and ‘not sport(x)’.

CPL description

Coupling of predicates

CPL relies primarily on coupling the learning of multiple functions so as to constrain the semi-supervised learning problem. CPL constrains the learned functions in two ways:

  1. Sharing among same-arity predicates according to logical relations
  2. Relation argument type-checking

Sharing among same-arity predicates

Each predicate P in the ontology has a list of other same-arity predicates with which P is mutually exclusive. If predicate A is mutually exclusive with predicate B, A’s positive instances and patterns become negative instances and negative patterns for B. For example, if ‘city’, which has an instance ‘Boston’ and a pattern ‘mayor of arg1’, is mutually exclusive with ‘scientist’, then ‘Boston’ and ‘mayor of arg1’ become a negative instance and a negative pattern, respectively, for ‘scientist’. Further, some categories are declared to be a subset of another category. For example, ‘athlete’ is a subset of ‘person’.
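The mutual-exclusion rule can be illustrated with a minimal Python sketch. The toy ontology, the `MUTEX` table and the function name `derive_negatives` are assumptions made for illustration and do not come from the CPL papers.

```python
from collections import defaultdict

# Hypothetical toy ontology: each predicate lists the same-arity
# predicates it is mutually exclusive with.
MUTEX = {
    "city": {"scientist"},
    "scientist": {"city"},
}

def derive_negatives(positives, mutex):
    """Turn each predicate's positives into negatives for the
    predicates it is mutually exclusive with."""
    negatives = defaultdict(set)
    for predicate, items in positives.items():
        for other in mutex.get(predicate, ()):
            negatives[other] |= items
    return dict(negatives)

positive_instances = {"city": {"Boston"}, "scientist": set()}
positive_patterns = {"city": {"mayor of arg1"}, "scientist": set()}

negative_instances = derive_negatives(positive_instances, MUTEX)
negative_patterns = derive_negatives(positive_patterns, MUTEX)
```

The same function serves instances and patterns because the rule treats both identically.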

Relation argument type-checking

Type-checking information is used to couple the learning of relations and categories. For example, the arguments of the ‘ceoOf’ relation are declared to be of the categories ‘person’ and ‘company’. CPL does not promote a pair of noun phrases as an instance of a relation unless the two noun phrases are classified as belonging to the correct argument types.
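A small Python sketch of this check follows. The `ARG_TYPES` table, the function name and the example noun phrases are invented for illustration; only the ‘ceoOf’ argument types are taken from the text.

```python
# Hypothetical argument-type declarations; only the ceoOf signature
# (person, company) comes from the article.
ARG_TYPES = {"ceoOf": ("person", "company")}

def passes_type_check(relation, pair, category_instances, arg_types=ARG_TYPES):
    """Return True only if both noun phrases are already classified
    under the categories declared for the relation's arguments."""
    first_type, second_type = arg_types[relation]
    first, second = pair
    return (first in category_instances.get(first_type, set())
            and second in category_instances.get(second_type, set()))

# Made-up category members used purely as test data.
category_instances = {"person": {"Alice Smith"}, "company": {"Acme Corp"}}

ok = passes_type_check("ceoOf", ("Alice Smith", "Acme Corp"), category_instances)
swapped = passes_type_check("ceoOf", ("Acme Corp", "Alice Smith"), category_instances)
```

Here `ok` is true and `swapped` is false, because swapping the arguments puts a company in the ‘person’ slot.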

Algorithm description

The following is a brief summary of the CPL algorithm.[2]

Input: An ontology O, and a text corpus C 
Output: Trusted instances/patterns for each predicate
for i=1,2,...,∞ do
    foreach predicate p in O do
        EXTRACT candidate instances/contextual patterns using recently promoted patterns/instances;
        FILTER candidates that violate coupling;
        RANK candidate instances/patterns;
        PROMOTE top candidates;
    end
end
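The loop above can be sketched in Python as a bounded skeleton. Every callable argument (`extract`, `violates_coupling`, `score`) is a hypothetical stand-in for the corresponding step, and the loop runs for a fixed number of iterations rather than indefinitely.

```python
def run_cpl(predicates, extract, violates_coupling, score,
            promote_limit, iterations=3):
    """Bounded sketch of the CPL loop; all callables are stand-ins."""
    promoted = {p: set() for p in predicates}
    for _ in range(iterations):
        for p in predicates:
            # EXTRACT candidates for predicate p
            candidates = extract(p, promoted)
            # FILTER candidates that violate coupling
            candidates = {c for c in candidates
                          if not violates_coupling(p, c, promoted)}
            # RANK: best score first, ties broken alphabetically
            ranked = sorted(candidates, key=lambda c: (-score(p, c), c))
            # PROMOTE the top candidates
            promoted[p].update(ranked[:promote_limit])
    return promoted

# Toy data and toy step functions, invented for this example.
toy_candidates = {"city": ["Boston", "Paris", "Einstein"],
                  "scientist": ["Einstein"]}
result = run_cpl(
    predicates=["city", "scientist"],
    extract=lambda p, promoted: set(toy_candidates[p]) - promoted[p],
    violates_coupling=lambda p, c, promoted: p == "city" and c == "Einstein",
    score=lambda p, c: len(c),
    promote_limit=1,
    iterations=2,
)
```

With these toy inputs, ‘Einstein’ is never promoted as a city because the coupling check rejects it before ranking.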

Inputs

The inputs are a large corpus of part-of-speech-tagged sentences and an initial ontology with predefined categories, relations, mutual-exclusion relationships between same-arity predicates, subset relationships between some categories, seed instances for all predicates, and seed patterns for the categories.

Candidate extraction

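CPL finds new candidate instances by using newly promoted patterns to extract the noun phrases that co-occur with those patterns in the text corpus; candidate patterns are found analogously from recently promoted instances. CPL extracts four kinds of candidates: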

  • Category Instances
  • Category Patterns
  • Relation Instances
  • Relation Patterns

Candidate filtering

Candidate instances and patterns are filtered to maintain high precision, and to avoid extremely specific patterns. An instance is only considered for assessment if it co-occurs with at least two promoted patterns in the text corpus, and if its co-occurrence count with all promoted patterns is at least three times greater than its co-occurrence count with negative patterns.
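The instance filter can be written as a short Python predicate. The sketch reads "at least three times greater" as a `>= 3 *` comparison, which is an interpretation, and the function name and data layout are assumptions.

```python
def keep_instance(cooccur, promoted_patterns, negative_patterns):
    """cooccur maps pattern -> co-occurrence count for one candidate.

    Keeps the candidate only if it co-occurs with at least two promoted
    patterns and its count with promoted patterns is at least three
    times its count with negative patterns (assumed reading).
    """
    distinct_promoted = sum(1 for p in promoted_patterns
                            if cooccur.get(p, 0) > 0)
    positive_count = sum(cooccur.get(p, 0) for p in promoted_patterns)
    negative_count = sum(cooccur.get(p, 0) for p in negative_patterns)
    return distinct_promoted >= 2 and positive_count >= 3 * negative_count

# Made-up patterns and counts for illustration.
promoted = {"mayor of arg1", "arg1 is a city"}
negative = {"arg1 won the Nobel"}

well_supported = {"mayor of arg1": 5, "arg1 is a city": 4, "arg1 won the Nobel": 1}
single_pattern = {"mayor of arg1": 2, "arg1 won the Nobel": 1}
too_noisy = {"mayor of arg1": 1, "arg1 is a city": 1, "arg1 won the Nobel": 1}
```

Only `well_supported` passes: `single_pattern` fails the two-pattern requirement and `too_noisy` fails the ratio test.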

Candidate ranking

CPL ranks candidate instances using the number of promoted patterns that they co-occur with so that candidates that occur with more patterns are ranked higher. Patterns are ranked using an estimate of the precision of each pattern.
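A possible Python sketch of both rankings is given below. The instance ranking follows the text directly. The precision estimate is an assumed formula (the share of a pattern's co-occurrences that involve already promoted instances), since the article does not state one; the function names are likewise hypothetical.

```python
def rank_instances(candidate_patterns, promoted_patterns):
    """candidate_patterns maps instance -> set of patterns it co-occurs with.
    Instances supported by more promoted patterns come first."""
    def support(instance):
        return len(candidate_patterns[instance] & promoted_patterns)
    return sorted(candidate_patterns, key=lambda inst: (-support(inst), inst))

def estimated_precision(pattern_counts, promoted_instances):
    """pattern_counts maps instance -> co-occurrence count with one pattern.
    Assumed estimate: fraction of co-occurrences with promoted instances."""
    total = sum(pattern_counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for inst, n in pattern_counts.items()
               if inst in promoted_instances)
    return hits / total

# Made-up data for illustration.
promoted_patterns = {"mayor of arg1", "arg1 is a city"}
candidate_patterns = {
    "Boston": {"mayor of arg1", "arg1 is a city"},
    "Einstein": {"mayor of arg1"},
    "Paris": {"mayor of arg1", "arg1 is a city"},
}
ranking = rank_instances(candidate_patterns, promoted_patterns)
precision = estimated_precision({"Boston": 3, "Einstein": 1}, {"Boston"})
```

Ties between instances are broken alphabetically so that the ordering is deterministic.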

Candidate promotion

CPL ranks the candidates according to their assessment scores and promotes at most 100 instances and 5 patterns for each predicate. Instances and patterns are only promoted if they co-occur with at least two promoted patterns or instances, respectively.
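The promotion step can be sketched as follows. The limits of 100 and 5 and the minimum support of two come from the text; the function name and the shape of its arguments are assumptions.

```python
MAX_INSTANCES = 100  # per predicate and iteration, as stated in the text
MAX_PATTERNS = 5     # per predicate and iteration, as stated in the text

def promote(ranked, support, limit, min_support=2):
    """ranked: candidates ordered best-first.
    support: candidate -> number of promoted patterns (for instances)
    or promoted instances (for patterns) it co-occurs with."""
    eligible = [c for c in ranked if support.get(c, 0) >= min_support]
    return eligible[:limit]

# Seven hypothetical candidate patterns, one with too little support.
ranked_patterns = [f"p{i}" for i in range(1, 8)]
support = {p: 2 for p in ranked_patterns}
support["p3"] = 1

promoted_patterns = promote(ranked_patterns, support, MAX_PATTERNS)
```

Candidates below the support threshold are skipped rather than counted against the limit, so lower-ranked eligible candidates can still be promoted.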

Meta-Bootstrap Learner

Meta-Bootstrap Learner (MBL) was also proposed by the authors of CPL.[2] MBL couples the training of multiple extraction techniques with a multi-view constraint, which requires the extractors to agree. This makes it feasible to add coupling constraints on top of existing extraction algorithms while treating them as black boxes. MBL assumes that the errors made by different extraction techniques are independent. The following is a brief summary of MBL.

Input: An ontology O, a set of extractors ε
Output: Trusted instances for each predicate
for i=1,2,...,∞ do
    foreach predicate p in O do
        foreach extractor e in ε do
            Extract new candidates for p using e with recently promoted instances;
        end
        FILTER candidates that violate mutual-exclusion or type-checking constraints;
        PROMOTE candidates that were extracted by all extractors;
    end
end

Subordinate algorithms used with MBL do not promote any instance on their own. Instead, they report the evidence about each candidate to MBL, which is responsible for promoting instances.
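The agreement requirement can be sketched in Python as a set intersection. The extractor names, the constraint check and the function name are illustrative assumptions. The sketch intersects first and filters second, which gives the same result as filtering before promotion.

```python
def mbl_promote(candidates_by_extractor, violates_constraints):
    """candidates_by_extractor maps extractor name -> set of candidates
    proposed for one predicate. Only candidates proposed by every
    extractor and passing the constraint check are promoted."""
    candidate_sets = list(candidates_by_extractor.values())
    if not candidate_sets:
        return set()
    agreed = set.intersection(*candidate_sets)
    return {c for c in agreed if not violates_constraints(c)}

# Two hypothetical extractors and a toy constraint check.
candidates = {
    "pattern_extractor": {"Boston", "Paris", "Einstein"},
    "list_extractor": {"Boston", "Einstein", "Tokyo"},
}
promoted = mbl_promote(candidates, lambda c: c == "Einstein")
```

In this example only ‘Boston’ is promoted: ‘Paris’ and ‘Tokyo’ lack agreement, and ‘Einstein’ is rejected by the constraint check.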

Applications

In their paper,[1] the authors present results showing that CPL can contribute new facts to Freebase,[3] an existing repository of semantic knowledge.

Notes

  1. Carlson, Andrew; Justin Betteridge; Estevam R. Hruschka Jr.; Tom M. Mitchell (2009). "Coupling semi-supervised learning of categories and relations". Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing. Colorado, USA: Association for Computational Linguistics. pp. 1–9. ISBN 9781932432381.
  2. Carlson, Andrew; Justin Betteridge; Richard C. Wang; Estevam R. Hruschka Jr.; Tom M. Mitchell (2010). "Coupled semi-supervised learning for information extraction". Proceedings of the third ACM international conference on Web search and data mining. NY, USA: ACM. pp. 101–110. doi:10.1145/1718487.1718501. ISBN 9781605588896.
  3. "Freebase data dumps". Metaweb Technologies. 2009. Archived from the original on December 6, 2011.

References

  • Liu, Qiuhua; Xuejun Liao; Lawrence Carin (2008). "Semi-supervised multitask learning". NIPS.
  • Shinyama, Yusuke; Satoshi Sekine (2006). "Preemptive information extraction using unrestricted relation discovery". HLT-Naacl.
  • Chang, Ming-Wei; Lev-Arie Ratinov; Dan Roth (2007). "Guiding semi-supervision with constraint driven learning". ACL.
  • Banko, Michele; Michael J. Cafarella; Stephen Soderland; Matt Broadhead; Oren Etzioni (2007). "Open information extraction from the web". IJCAI.
  • Blum, Avrim; Tom Mitchell (1998). "Combining labeled and unlabeled data with co-training". Proceedings of the eleventh annual conference on Computational learning theory. pp. 92–100. doi:10.1145/279943.279962. ISBN 1581130570. S2CID 207228399.
  • Riloff, Ellen; Rosie Jones (1999). "Learning dictionaries for information extraction by multi-level bootstrapping". AAAI.
  • Rosenfeld, Benjamin; Ronen Feldman (2007). "Using corpus statistics on entities to improve semi-supervised relation extraction from the web". ACL.
  • Wang, Richard C.; William W. Cohen (2008). "Iterative set expansion of named entities using the web". ICDM.
