A Study of Model Selection for Electric Data using Cross Validation Approach

@article{ART002298240,
author={Saraswathi Sivamani and Saravana Kumar and 신창선 and 박장우 and 조용윤},
title={A Study of Model Selection for Electric Data using Cross Validation Approach},
journal={한국지식정보기술학회 논문지},
issn={1975-7700},
year={2017},
volume={12},
number={6},
pages={837--844},
doi={10.34163/jkits.2017.12.6.005},
url={http://dx.doi.org/10.34163/jkits.2017.12.6.005}
}

TY - JOUR
AU - Saraswathi Sivamani
AU - Saravana Kumar
AU - 신창선
AU - 박장우
AU - 조용윤
TI - A Study of Model Selection for Electric Data using Cross Validation Approach
T2 - 한국지식정보기술학회 논문지
PY - 2017
VL - 12
IS - 6
PB - 한국지식정보기술학회
SP - 837
EP - 844
SN - 1975-7700
AB - In this paper, an appropriate model is selected for risk assessment of electric utility pole data with the help of model selection cheat sheets and k-fold cross validation. To analyze, predict, and forecast the data, an appropriate model has to be selected. The major issue is a decline in model-fitting accuracy, which may result in poor model selection. The many different types of machine learning algorithms make it difficult to settle on a model. To ensure proper model selection, we follow a two-step process. First, candidate models are chosen using the existing scikit-learn and Microsoft Azure model selection cheat sheets, based on the available input and the required output of the data. After working through the cheat-sheet questions, the candidate models obtained are the Generalized Additive Model, Generalized Linear Model, Linear Regression, and Support Vector Machine. Second, to identify the appropriate model, we perform k-fold cross validation to estimate the risk of each algorithm, comparing 2-fold, 8-fold, and 10-fold cross validation. Among the three settings, the generalized additive model under 10-fold cross validation is selected, as it has the lowest risk error. Using k-fold cross validation on the electric power data set, we estimate the accuracy of the model that is suitable for the data.
KW - Model selection, K-fold cross validation, Machine learning, Model fit, Electric power
DO - 10.34163/jkits.2017.12.6.005
UR - http://dx.doi.org/10.34163/jkits.2017.12.6.005
ER -

Saraswathi Sivamani, Saravana Kumar, 신창선, 박장우 and 조용윤 (2017). A Study of Model Selection for Electric Data using Cross Validation Approach. 한국지식정보기술학회 논문지, 12(6), 837-844.

Saraswathi Sivamani, Saravana Kumar, 신창선, 박장우 and 조용윤. 2017, “A Study of Model Selection for Electric Data using Cross Validation Approach”, 한국지식정보기술학회 논문지, vol. 12, no. 6, pp. 837-844. Available from: doi:10.34163/jkits.2017.12.6.005

Saraswathi Sivamani, et al. “A Study of Model Selection for Electric Data using Cross Validation Approach.” 한국지식정보기술학회 논문지 12.6 (2017): 837-844.

Saraswathi Sivamani, Saravana Kumar, 신창선, 박장우, 조용윤. A Study of Model Selection for Electric Data using Cross Validation Approach. 한국지식정보기술학회 논문지 [Internet]. 2017;12(6):837-844. Available from: doi:10.34163/jkits.2017.12.6.005

Saraswathi Sivamani, Saravana Kumar, 신창선, 박장우 and 조용윤. “A Study of Model Selection for Electric Data using Cross Validation Approach.” 한국지식정보기술학회 논문지 12, no. 6 (2017): 837-844. doi: 10.34163/jkits.2017.12.6.005

한국지식정보기술학회 논문지

Abbreviation: 한국지식정보기술학회 논문지

2017, vol. 12, no. 6, pp. 837-844

DOI: 10.34163/jkits.2017.12.6.005

Publisher: 한국지식정보기술학회

Research field: Interdisciplinary studies

Saraswathi Sivamani¹ , Saravana Kumar² , 신창선³ , 박장우⁴ , 조용윤⁵

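¹Sunchon National University

²Sunchon National University

³Sunchon National University

⁴Sunchon National University

⁵Sunchon National University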

Abstract

In this paper, an appropriate model is selected for risk assessment of electric utility pole data with the help of model selection cheat sheets and k-fold cross validation. To analyze, predict, and forecast the data, an appropriate model has to be selected. The major issue is a decline in model-fitting accuracy, which may result in poor model selection. The many different types of machine learning algorithms make it difficult to settle on a model. To ensure proper model selection, we follow a two-step process. First, candidate models are chosen using the existing scikit-learn and Microsoft Azure model selection cheat sheets, based on the available input and the required output of the data. After working through the cheat-sheet questions, the candidate models obtained are the Generalized Additive Model, Generalized Linear Model, Linear Regression, and Support Vector Machine. Second, to identify the appropriate model, we perform k-fold cross validation to estimate the risk of each algorithm, comparing 2-fold, 8-fold, and 10-fold cross validation. Among the three settings, the generalized additive model under 10-fold cross validation is selected, as it has the lowest risk error. Using k-fold cross validation on the electric power data set, we estimate the accuracy of the model that is suitable for the data.
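
The cross-validation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two stand-in models (a mean predictor and a least-squares line), the mean squared error used as the risk, the synthetic data, and every function name are assumptions made for this example. The paper's actual candidates are the Generalized Additive Model, Generalized Linear Model, Linear Regression, and Support Vector Machine, fitted to electric utility pole data.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Stand-in baseline model: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Stand-in model: least-squares line y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def cv_risk(fit, xs, ys, k, seed=0):
    """Estimate risk as the average held-out mean squared error over k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        fold_errors.append(mean((ys[i] - model(xs[i])) ** 2 for i in held_out))
    return mean(fold_errors)


def select_model(candidates, xs, ys, ks=(2, 8, 10)):
    """Score every (model, k) pair and return the risk table and the best pair."""
    risks = {
        (name, k): cv_risk(fit, xs, ys, k)
        for name, fit in candidates.items()
        for k in ks
    }
    best = min(risks, key=risks.get)
    return risks, best


if __name__ == "__main__":
    # Synthetic data, used only so the sketch runs end to end.
    rng = random.Random(42)
    xs = [i / 10 for i in range(100)]
    ys = [2.0 + 3.0 * x + rng.gauss(0, 0.5) for x in xs]
    risks, best = select_model({"mean": fit_mean, "line": fit_line}, xs, ys)
    for (name, k), risk in sorted(risks.items()):
        print(f"{name:>5}  k={k:<2}  risk={risk:.4f}")
    print("selected:", best)
```

Comparing several values of k, as the abstract does with 2, 8, and 10, trades the amount of training data per fold against the cost of refitting; the pair with the lowest estimated risk is the one selected.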

Keywords

Model selection, K-fold cross validation, Machine learning, Model fit, Electric power

References (16)

[Book] R. S. Michalski / 2013 / Machine learning: An artificial intelligence approach / Springer Science & Business Media
[Book] C. M. Bishop / 2006 / Pattern recognition and machine learning (Information Science and Statistics) / Springer-Verlag New York, Inc.
[Book] D. J. Hand / 2001 / Principles of data mining / MIT Press
[Journal] A. K. Jain / 2000 / Statistical pattern recognition: A review / IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (1) : 4 ~ 37
[Book] T. O. Ayodele / 2010 / New advances in machine learning / InTech
[Report] S. Arlot / 2010 / A survey of cross-validation procedures for model selection : 40 ~ 79
[Conference] B. Gu / 2015 / A new generalized error path algorithm for model selection / International Conference on Machine Learning : 2549 ~ 2558
[Journal] R. R. Bies / 2006 / A genetic algorithm-based, hybrid machine learning approach to model selection / Journal of Pharmacokinetics and Pharmacodynamics 33 (2) : 195 ~ 221
[Journal] F. Pedregosa / 2011 / Scikit-learn: Machine learning in Python / Journal of Machine Learning Research 12 : 2825 ~ 2830
[Book] S. Mund / 2015 / Microsoft Azure machine learning / Packt Publishing Ltd
[Book] S. Chatterjee / 2015 / Regression analysis by example / John Wiley & Sons
[Book] D. Michie / 1994 / Machine learning, neural and statistical classification
[Book] S. B. Kotsiantis / 2007 / Supervised machine learning: A review of classification techniques
[Journal] T. Kanungo / 2002 / An efficient k-means clustering algorithm: Analysis and implementation / IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (7) : 881 ~ 892
[Journal] F. Pedregosa / 2011 / Scikit-learn: Machine learning in Python / Journal of Machine Learning Research 12 : 2825 ~ 2830
[Web resource] / Microsoft Azure