Bibliographic record - detail view

 
Author: Mo, Yuji
Title: Cost-Aware Hierarchical Active Learning and Sub-Linear Time Embedding Based Deep Retrieval
Source: (2022), (118 pages)
Ph.D. Dissertation, The University of Nebraska - Lincoln
Language: English
Document type: print; online; monograph
ISBN: 979-8-4387-5750-4
Keywords: University thesis; Dissertation; Active Learning; Algorithms; Classification; Models; Information Retrieval; Indexing; Naming
Abstract: The research in this dissertation consists of two parts: an active learning algorithm for hierarchical labels and an embedding-based retrieval algorithm. In the first part, we present a new approach for learning hierarchically decomposable concepts. The approach learns a high-level classifier (e.g., location vs. non-location) by separately learning multiple finer-grained classifiers (e.g., museum vs. non-museum), and then combining the results. Soliciting labels at a finer level of granularity than that of the target concept is a new approach to active learning, which we term active over-labeling. In experiments on NER and document classification tasks, we show that active over-labeling substantially improves area under the precision-recall curve when compared with standard passive or active learning. Finally, because finer-grained labels may be more expensive to obtain, we also present a cost-sensitive active learner that uses a multi-armed bandit approach to dynamically choose the label granularity to target, and show that the bandit-based learner is robust to differences in label cost and labeling budget. In the second part, we present a Bayesian Deep Structured Semantic Model (BDSSM) that works efficiently in retrieval tasks with a large pool of candidates for real-time applications, e.g., in search engines, digital ads, and recommendation systems. The efficiency is achieved by indexing the items into groups based on the sparse representation of their embeddings during offline pre-computation. In the online retrieval phase, the algorithm only retrieves and ranks items from indices that are relevant to the query. We explore optimization strategies in the algorithm to make the sparse representation sparser. In evaluation, the algorithm is compared with other popular clustering-based, hashing-based, and tree-based retrieval methods. We measure the differences in multiple dimensions, including retrieval recall, storage of embeddings, and CPU time. We show that this algorithm outperforms the other algorithms in both recall and CPU time under the same storage limit. Finally, we also show that this algorithm can be used in exploration when the model is recurrently retrained. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.] (As Provided).
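The offline indexing and online retrieval steps described in the abstract can be illustrated with a small sketch. It is not the BDSSM from the dissertation: it assumes that "sparse representation" can be approximated by keeping the `k` largest-magnitude coordinates of each embedding, and that candidates are ranked by dot product. The function names `sparsify`, `build_index`, and `retrieve` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def sparsify(vec, k):
    """Return {dimension: value} for the k largest-magnitude nonzero entries.

    Assumption: top-k truncation stands in for the learned sparse
    representation mentioned in the abstract.
    """
    top = sorted(range(len(vec)), key=lambda d: -abs(vec[d]))[:k]
    return {d: vec[d] for d in top if vec[d] != 0.0}


def build_index(item_embeddings, k):
    """Offline step: group item ids by the dimensions active in their sparse form."""
    index = defaultdict(list)
    sparse_items = []
    for item_id, emb in enumerate(item_embeddings):
        sparse = sparsify(emb, k)
        sparse_items.append(sparse)
        for dim in sparse:
            index[dim].append(item_id)
    return index, sparse_items


def retrieve(query, index, sparse_items, k, top_n):
    """Online step: score only items that share an active dimension with the query."""
    q = sparsify(query, k)
    candidates = set()
    for dim in q:
        candidates.update(index.get(dim, []))

    def score(item_id):
        item = sparse_items[item_id]
        return sum(value * item.get(dim, 0.0) for dim, value in q.items())

    # Highest score first; ties are broken by item id so the order is deterministic.
    return sorted(candidates, key=lambda i: (-score(i), i))[:top_n]


items = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
index, sparse_items = build_index(items, k=1)
print(retrieve([1.0, 0.0, 0.0, 0.0], index, sparse_items, k=1, top_n=5))
```

In this toy setup only items 0 and 2 share an active dimension with the query, so only those two are scored. The smaller `k` is, the fewer index groups each query touches, which is one plausible reading of why a sparser representation reduces retrieval cost.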
Notes: ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Recorded by: ERIC (Education Resources Information Center), Washington, DC
Update: 2024/1/01