Abstract
When we discriminate the Swiss banknote data by IP-OLDF, we find that these data are linearly separable data (LSD). Because we examine all possible combination models, we find that a two-variable model, (X4, X6), is the minimum linearly separable model. By the monotonic decrease of MNM (MNM_p ≥ MNM_(p+1)), the 16 models that include these two variables are linearly separable, and the other 47 models are not. Therefore, we compare eight LDFs by the best models, those with the minimum mean error rate in the validation samples (M2), and obtain good results. Although we could not explain the useful meaning of the 95% CI of the discriminant coefficients until now, the pass/fail determinations using examination scores in Chap. 5 provide a clear understanding by normalizing the coefficients. Seven LDFs (Revised IP-OLDF based on MNM, Revised LP-OLDF, Revised IPLP-OLDF, three SVMs, and logistic regression) become trivial LDFs; only Fisher's LDF is not trivial. We successfully explain the meaning of the coefficients. Therefore, in Chap. 6 we discuss the relationship between the best model and the coefficients more precisely using the Swiss banknote data. We study LSD discrimination precisely through the Swiss banknote data, the student linearly separable data in Chap. 4, six pass/fail determinations using examination scores in Chap. 5, and the Japanese-automobile data in Chap. 7. When we discriminate six microarray datasets that are LSD in Chap. 8, only Revised IP-OLDF can naturally perform feature selection and drastically reduce the high-dimensional gene space to a small gene space. In gene analysis, we call every linearly separable model a "Matroska." The full model is the largest Matroska and includes all smaller Matroskas within it. As we already know, the smallest Matroska (BGS) explains the Matroska structure completely through the monotonic decrease of MNM. We propose the Matroska feature-selection method for microarray datasets (Method 2). Because LSD discrimination is no longer popular, we explain Method 2 through detailed examples of the Swiss banknote and Japanese-automobile data. On the other hand, LASSO also attempts feature selection; if it cannot find the small Matroskas (SMs) in a dataset, it cannot explain the Matroska structure. The Swiss banknote data, the Japanese-automobile data, and the six microarray datasets are therefore helpful for evaluating the usefulness of other feature-selection methods, including LASSO.
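The all-possible-combinations search and the MNM monotonicity (MNM_p ≥ MNM_(p+1)) can be illustrated with a short sketch. The Python code below is a minimal stand-in, not the author's LINGO/Revised IP-OLDF implementation: it only checks whether MNM = 0 (linear separability) for each variable subset via an LP feasibility problem, so every separable model, such as the 16 Swiss banknote models containing (X4, X6), would be flagged. The names X, y, and both helper functions are hypothetical.

```python
"""Sketch: enumerate all variable subsets and flag the linearly separable
models, illustrating why MNM decreases monotonically (a superset of a
separable subset is itself separable). Assumes X is an (n x p) NumPy
feature matrix and y holds labels in {-1, +1}."""
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

def is_linearly_separable(X, y):
    """LP feasibility check: does some (w, b) satisfy y_i (w.x_i + b) >= 1?

    Feasible <=> MNM = 0 for this variable subset."""
    n, p = X.shape
    # Decision variables: [w (p entries), b (1 entry)].
    # Encode y_i * (w.x_i + b) >= 1 as -y_i * (w.x_i + b) <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    b_ub = -np.ones(n)
    res = linprog(c=np.zeros(p + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (p + 1), method="highs")
    return res.status == 0  # status 0: a feasible optimum was found

def separable_models(X, y):
    """Return every column subset whose model is linearly separable."""
    p = X.shape[1]
    hits = []
    for k in range(1, p + 1):
        for cols in combinations(range(p), k):
            if is_linearly_separable(X[:, list(cols)], y):
                hits.append(cols)
    return hits
```

For the six-variable Swiss banknote data this enumerates 63 candidate models; under the abstract's result, 16 of them (all those containing X4 and X6) would be reported as separable.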
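Method 2 can be sketched in the same spirit: repeatedly extract a small Matroska (SM) from the remaining genes and continue until the remainder is no longer LSD. The greedy backward elimination below is only an assumption-laden stand-in for the exact search performed by Revised IP-OLDF, and it is not guaranteed to reach the minimum model (BGS); it reuses the hypothetical is_linearly_separable helper from the sketch above.

```python
"""Sketch of the Matroska feature-selection idea (Method 2): peel off
small linearly separable gene sets until the remaining genes are no
longer LSD. Greedy elimination approximates the exact IP-based search."""

def find_small_matroska(X, y, cols):
    """Greedily drop columns while the model stays linearly separable."""
    keep = list(cols)
    for c in list(keep):
        trial = [j for j in keep if j != c]
        if trial and is_linearly_separable(X[:, trial], y):
            keep = trial  # c was not needed for separability
    return keep

def matroska_decomposition(X, y):
    """List disjoint small Matroskas (SMs) contained in the full model."""
    remaining = list(range(X.shape[1]))
    sms = []
    while remaining and is_linearly_separable(X[:, remaining], y):
        sm = find_small_matroska(X, y, remaining)
        sms.append(sm)
        # Remove the found SM and search the leftover gene space again.
        remaining = [j for j in remaining if j not in sm]
    return sms
```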