Authors: Pedro Sandoval Segura (1, 2); Julius Lauw (1); Daniel Bashir (1); Kinjal Shah (1); Sonia Sehra (1); Dominique Macias (1) and George Montañez (1)

Affiliations: (1) AMISTAD Lab, Department of Computer Science, Harvey Mudd College, Claremont, CA, U.S.A.; (2) Department of Computer Science, University of Maryland, College Park, MD, U.S.A.

Keyword(s): Machine Learning, Model Complexity, Algorithm Capacity, VC Dimension, Label Recorder.

Abstract: Algorithm performance in supervised learning is a combination of memorization, generalization, and luck. By estimating how much information an algorithm can memorize from a dataset, we can set a lower bound on the amount of performance due to other factors such as generalization and luck. With this goal in mind, we introduce the Labeling Distribution Matrix (LDM) as a tool for estimating the capacity of learning algorithms. The method attempts to characterize the diversity of the outputs an algorithm can produce across different training datasets, and uses this diversity to measure the algorithm's flexibility and responsiveness to data. We test the method on several supervised learning algorithms and find that, while the results are not conclusive, the LDM does allow us to gain potentially valuable insight into the prediction behavior of algorithms. We also introduce the Label Recorder as an additional tool for estimating algorithm capacity, with more promising initial results.
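
The abstract does not give the LDM construction itself, so the following is only a minimal sketch of the general idea it describes: train an algorithm on many different training datasets, record how it labels a fixed holdout set each time, and measure how diverse those labelings are. The two toy learners, the helper names (`labeling_matrix`, `labeling_entropy`), and the use of empirical entropy as the diversity measure are all assumptions made for illustration, not the paper's definitions.

```python
import numpy as np
from collections import Counter

def one_nn_predict(X_train, y_train, X_test):
    """Toy flexible learner (assumed for illustration): 1-nearest-neighbor."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[dists.argmin(axis=1)]

def majority_predict(X_train, y_train, X_test):
    """Toy rigid learner (assumed for illustration): always predict the majority class."""
    majority = np.bincount(y_train).argmax()
    return np.full(len(X_test), majority)

def labeling_matrix(learner, X, y, X_holdout, n_datasets, train_size, rng):
    """Each row is the labeling of the fixed holdout set after training on one sampled dataset."""
    rows = []
    for _ in range(n_datasets):
        idx = rng.choice(len(X), size=train_size, replace=False)
        rows.append(learner(X[idx], y[idx], X_holdout))
    return np.stack(rows)

def labeling_entropy(matrix):
    """Empirical entropy (in bits) of the distinct holdout labelings, used here as a diversity proxy."""
    counts = Counter(map(tuple, matrix))
    probs = np.array(list(counts.values()), dtype=float) / len(matrix)
    return float(-(probs * np.log2(probs)).sum())

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = rng.integers(0, 2, size=200)      # random labels, so any fit is memorization
X_holdout = rng.normal(size=(8, 2))

flexible = labeling_matrix(one_nn_predict, X, y, X_holdout, 50, 20, rng)
rigid = labeling_matrix(majority_predict, X, y, X_holdout, 50, 20, rng)

print("1-NN labeling entropy (bits):    ", labeling_entropy(flexible))
print("majority labeling entropy (bits):", labeling_entropy(rigid))
```

In this sketch the majority-class learner can only ever produce two holdout labelings (all zeros or all ones), so its entropy is capped at one bit, while a learner that responds more strongly to its training data can spread over many more labelings. That contrast is one way to read "flexibility and responsiveness to data", under the assumptions stated above.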

CC BY-NC-ND 4.0

Paper citation in several formats:
Segura, P.; Lauw, J.; Bashir, D.; Shah, K.; Sehra, S.; Macias, D. and Montañez, G. (2020). The Labeling Distribution Matrix (LDM): A Tool for Estimating Machine Learning Algorithm Capacity. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-395-7; ISSN 2184-433X, SciTePress, pages 980-986. DOI: 10.5220/0009178209800986

@conference{icaart20,
author={Pedro Sandoval Segura and Julius Lauw and Daniel Bashir and Kinjal Shah and Sonia Sehra and Dominique Macias and George Montañez},
title={The Labeling Distribution Matrix (LDM): A Tool for Estimating Machine Learning Algorithm Capacity},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2020},
pages={980-986},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009178209800986},
isbn={978-989-758-395-7},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - The Labeling Distribution Matrix (LDM): A Tool for Estimating Machine Learning Algorithm Capacity
SN - 978-989-758-395-7
IS - 2184-433X
AU - Segura, P.
AU - Lauw, J.
AU - Bashir, D.
AU - Shah, K.
AU - Sehra, S.
AU - Macias, D.
AU - Montañez, G.
PY - 2020
SP - 980
EP - 986
DO - 10.5220/0009178209800986
PB - SciTePress