A neural approach to the value investing tool F-score

This work is the first neural approach to Piotroski’s (2000) F-Score. From the same informative signals, our approach based on network data envelopment analysis allows for 1) overcoming the binary perspective of classification between companies with good/bad fundamentals, and 2) appropriately assessing the existing interaction among a company’s main financial areas. The analysis of a complete sample of the largest listed companies in the Eurozone and in the U.S. market in the period 2006-2017 shows that our neural F-Score significantly improves the portfolio returns obtained by the original F-Score.

improves the success of these value strategies (e.g., Novy-Marx, 2013). Piotroski (2000) developed a fundamental score (F-Score) based on accounting signals that differentiated between companies with good and bad fundamentals among all of those with high BM ratios. Piotroski (2000) established that official financial information was useful for the appropriate selection of these companies because 1) the companies tend to be ignored by analysts, 2) the information that companies voluntarily report to the market lacks credibility given their poor recent performance, and 3) the companies tend to be financially distressed. Piotroski (2000) showed that an annual excess return of approximately 7.50% could be obtained in the short term with respect to the rest of the companies with high BM ratios in the U.S. stock market during the period 1976-1996. These results had a great impact on both industry and academia. 1
However, the motivation of our paper is that the F-Score shows some methodological limitations that could affect the reliability of its valuations. The main limitations are as follows:
1. The binary approach of the accounting signals included in the F-Score is too simple to reflect the great variety of financial situations among the companies analyzed. It is not sufficient to determine whether the accounting signals have increased or decreased; rather, it is necessary to consider the magnitude of these variations for better scoring of the companies.
2. The F-Score considers no interaction between its accounting signals; thus, it captures no relationship among a company's three main financial areas, as defined by Piotroski (2000): 1) profitability; 2) leverage, liquidity, and source of funds; and 3) operating efficiency.
Our neural approach based on network data envelopment analysis (network DEA) allows for overcoming the two previous limitations, thus generating more reliable selections in the short term for companies with high BM ratios. Our neural F-Score keeps the rationale of all of the accounting signals included in the F-Score and additionally considers both their magnitude and the interaction between them. We empirically show that the application of our neural approach to the challenging valuation problem of large-cap companies provides more reliable estimates than the use of binary and isolated signals derived from the original F-Score.
1 A large number of subsequent studies compared the F-Score for various markets and time horizons. Richardson et al. (2010) carried out an exhaustive review of applications of fundamental analysis to accounting signals.
The next section of our work summarizes the F-Score and our neural F-Score. Section 3 presents the empirical analysis. Finally, Section 4 includes the main conclusions of our study.
2. The F-Score and our Neural F-Score Proposal.

The F-Score
Piotroski (2000) defined the F-Score as the sum of nine binary accounting signals that can assume a value of 0 or 1. These signals are defined and grouped into a company's three large financial areas. The F-Score takes a maximum (minimum) value equal to 9 (0) when a company presents positive (negative) scoring signals for all of the accounting signals included.

ACCRUAL. Net income before extraordinary items less cash flow from operations, scaled by total assets at the beginning of year t. This signal is 1 if ACCRUAL is negative and 0 otherwise.

LEVER. Change in the firm's debt-to-asset ratio between the end of year t and year t-1. The debt-to-asset ratio is defined as the firm's total long-term debt (including the portion of long-term debt classified as current), scaled by average total assets. This signal is 1 if LEVER is negative and 0 otherwise.
LIQUID. Change in the firm's current ratio between the end of year t and year t-1. The current ratio is defined as total current assets divided by total current liabilities. This signal is 1 if LIQUID is positive and 0 otherwise.
EQ_OFFER. This signal is 1 if the firm did not issue common equity in the year preceding portfolio formation and 0 otherwise.

Area 3 - Operating Efficiency
MARGIN. Gross margin (net sales less cost of goods sold) for the year preceding portfolio formation, scaled by net sales for the year, less the firm's gross margin (scaled by net sales) from year t-1. This signal is 1 if MARGIN is positive and 0 otherwise.

TURN. Change in the firm's asset turnover ratio between the end of year t and year t-1. The asset turnover ratio is defined as net sales scaled by average total assets for the year. This signal is 1 if TURN is positive and 0 otherwise.
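The binary scoring rule for the signals defined above can be illustrated with a minimal Python sketch. It covers only the six signals described in this section (ACCRUAL, LEVER, LIQUID, EQ_OFFER, MARGIN, and TURN), so it yields a partial score rather than the full nine-signal F-Score. The function names, the dictionary layout, and the sample figures are hypothetical and chosen only for illustration.

```python
# Sign rules taken from the signal definitions above:
# True  -> the signal scores 1 when the underlying value is positive
# False -> the signal scores 1 when the underlying value is negative
POSITIVE_IS_GOOD = {
    "ACCRUAL": False,
    "LEVER": False,
    "LIQUID": True,
    "MARGIN": True,
    "TURN": True,
}


def binary_signal(value: float, positive_is_good: bool) -> int:
    """Return 1 if the value has the favourable sign, 0 otherwise."""
    if positive_is_good:
        return 1 if value > 0 else 0
    return 1 if value < 0 else 0


def partial_f_score(values: dict, issued_equity: bool) -> int:
    """Sum the binary signals for the six measures described in the text."""
    score = sum(
        binary_signal(values[name], rule)
        for name, rule in POSITIVE_IS_GOOD.items()
    )
    # EQ_OFFER scores 1 when the firm did not issue common equity.
    score += 0 if issued_equity else 1
    return score


# Hypothetical firm-year; the figures are invented for illustration only.
example = {
    "ACCRUAL": -0.02,
    "LEVER": 0.01,
    "LIQUID": 0.15,
    "MARGIN": 0.03,
    "TURN": -0.05,
}
print(partial_f_score(example, issued_equity=False))
```

Note that the rule looks only at the sign of each value: a LIQUID change of 0.15 and one of 0.001 would both score 1, which is the loss of magnitude information that motivates the NF-Score.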

Our Neural Approach to the F-Score (NF-Score)
Figure 1 reflects the structure of our neural F-Score (NF-Score). Following the classification of Kao (2014), the NF-Score is a mixed structure, that is, a combination of a parallel and a series structure. The NF-Score includes nine signals similar to those of the F-Score, but they are evaluated in terms of the output obtained versus the input consumed. That is, the NF-Score does not consider these signals in binary terms (1,0) but rather assigns them a score as a function of the existing output/input relationship of these signals. These nine signals are grouped in the same three financial areas defined by Piotroski (2000), but our neural approach interconnects them through some of the signals defined as links. A link is a variable that is considered an output for a signal of one financial area and, at the same time, an input for a signal from another area.
After defining our neural structure in Figure 1, we describe a suitable procedure for modeling it. We work with n companies (j = 1, …, n), each consisting of nine signals (k = 1, …, 9). Let $m_k$ and $r_k$ be the numbers of inputs and outputs of signal k, respectively. The link from signal k to signal h is denoted by (k,h), and the set of links by L. The inputs of company j at signal k are $\{x_j^k \in \mathbb{R}_+^{m_k}\}$, and the outputs of company j at signal k are $\{y_j^k \in \mathbb{R}_+^{r_k}\}$, where j = 1, …, n and k = 1, …, 9.
The link variables from signal k to signal h are $\{z_j^{(k,h)} \in \mathbb{R}_+^{t_{(k,h)}}\}$, where $t_{(k,h)}$ is the number of intermediate inputs and outputs in link (k,h). $\lambda^k \in \mathbb{R}_+^n$ is the intensity vector of signal k, and $s^{k-}$ and $s^{k+}$ are the nonnegative slack vectors of input excesses and output shortfalls, respectively. We assume a variable returns-to-scale (VRS) hypothesis because it better evaluates signals when not all of the companies operate at the optimal scale. Thus, the production possibility set P is spanned by the convex hull of the existing companies.
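As a sketch, and assuming the usual VRS formulation of network slacks-based models rather than the authors' exact equation, the production possibility set can be written as follows. Here $\lambda^k$ denotes the intensity vector of signal k and $z^{(k,h)}$ the link variables from signal k to signal h; treating each link as an output of signal k and an input of signal h is an assumption of this sketch.

```latex
\[
P = \left\{ \left(x^k, y^k, z^{(k,h)}\right) \;\middle|\;
\begin{aligned}
 & x^k \ge \sum_{j=1}^{n} x_j^k \lambda_j^k, \qquad
   y^k \le \sum_{j=1}^{n} y_j^k \lambda_j^k
   && (k = 1, \dots, 9), \\
 & z^{(k,h)} = \sum_{j=1}^{n} z_j^{(k,h)} \lambda_j^k
   && \forall (k,h) \in L \ \text{(as outputs of signal } k\text{)}, \\
 & z^{(k,h)} = \sum_{j=1}^{n} z_j^{(k,h)} \lambda_j^h
   && \forall (k,h) \in L \ \text{(as inputs of signal } h\text{)}, \\
 & \sum_{j=1}^{n} \lambda_j^k = 1, \qquad \lambda_j^k \ge 0
   && (j = 1, \dots, n;\ k = 1, \dots, 9)
\end{aligned}
\right\}
\]
```

The constraint $\sum_j \lambda_j^k = 1$ is what encodes the VRS hypothesis: each signal is compared only with convex combinations of the observed companies.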
Based on the original slacks-based model (SBM) 3 of Tone (2001), the NF-Score of each company is obtained from model [3], where $P_k$ is the set of signals with the link $(f,k) \in L$ (predecessors of signal k), $t_{(f,k)}$ is the number of intermediate inputs and outputs in that link, $F_k$ is the set of signals with the link $(k,h) \in L$ (successors of signal k), and $t_{(k,h)}$ is the number of intermediate inputs and outputs in that link.
A company will be positively evaluated overall in model [3] when optimal slacks $(s^{k-*}, s^{k+*})$ together with optimal intermediate input and output slacks $(s^{(f,k)-*}, s^{(k,h)+*})$ result in NF-Score = 1.
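For orientation, the single-process SBM of Tone (2001) that underlies this kind of model has the well-known form below, shown here under VRS. This is not the network model [3] itself, which extends the ratio across the nine signals and their links. The symbols $m$, $r$, $x_{io}$, $y_{io}$, $X$, and $Y$ follow the usual SBM notation for a single evaluated company o and are assumptions of this sketch, not notation taken from the text above.

```latex
\[
\rho^{*} \;=\; \min_{\lambda,\, s^{-},\, s^{+}}\;
\frac{1 - \dfrac{1}{m} \displaystyle\sum_{i=1}^{m} \dfrac{s_i^{-}}{x_{io}}}
     {1 + \dfrac{1}{r} \displaystyle\sum_{i=1}^{r} \dfrac{s_i^{+}}{y_{io}}}
\]
\[
\text{subject to}\quad
x_o = X\lambda + s^{-}, \qquad
y_o = Y\lambda - s^{+}, \qquad
\sum_{j=1}^{n} \lambda_j = 1, \qquad
\lambda \ge 0,\; s^{-} \ge 0,\; s^{+} \ge 0 .
\]
```

Because every slack enters the numerator negatively and the denominator positively, $\rho^{*} = 1$ holds only when all optimal slacks are zero, which is the same logic by which a company reaches a score of 1 when its signal and link slacks vanish.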

Data
At the end of each year between 2006 and 2017, we identified all of the companies that were listed in euros (dollars) and that were part of the FTSE Eurofirst 300 (Standard & Poor's 500) as the representative benchmarks of the Eurozone and the U.S. stock market, respectively.
Following the usual practice in these studies, we excluded the financial sector. Thus, we worked with a complete sample of the largest nonfinancial companies in the world's most economically relevant stock markets. The choice of this challenging sample to validate our neural model is justified by the difficulty of finding successful value strategies in large companies with rapid information-dissemination environments (Piotroski, 2000).
Finally, our sample consisted of 1,678 (5,018) company observations, which means an average number of 140 (418) large-cap companies analyzed each year in the Eurozone (U.S.) market. The accounting information and the daily historical quotes of each company were obtained from Datastream-Thomson Reuters. We excluded approximately 3.5% of company observations due to the unavailability of data.

Empirical Results
We developed an empirical analysis similar to that of Piotroski (2000). For each year, we selected companies with BM ratios higher than the median. 5 The final sample consisted of 922 (2,266) observations of the most undervalued large companies in the Eurozone (U.S.) market from 2006 to 2017.
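The annual selection step can be sketched in a few lines of Python. The function name, the dictionary layout, and the BM figures are hypothetical; the only rule taken from the text is that a company is kept when its BM ratio is strictly higher than the median of that year.

```python
from statistics import median


def select_high_bm(bm_by_company: dict) -> list:
    """Return the companies whose BM ratio is above the yearly median."""
    cutoff = median(bm_by_company.values())
    return sorted(name for name, bm in bm_by_company.items() if bm > cutoff)


# Hypothetical BM ratios for one year; the values are invented.
bm_ratios = {"A": 0.4, "B": 1.2, "C": 0.9, "D": 0.6, "E": 1.5}
print(select_high_bm(bm_ratios))
```

In practice the function would be applied once per year, so that the cutoff adapts to the valuation level of each annual cross-section.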
Next, we selected companies each year with a maximum F-Score of 9 and a minimum F-Score of 0 according to equation [1]. 6 Subsequently, we applied our NF-Score model [3] to the same annual sample of companies. The NF values obtained each year were grouped into 10 clusters based on the k-medoids technique. Therefore, each year, we obtained a score between 0 and 9 for each company assigned by the original F-Score of Piotroski (2000). Likewise, we also obtained an NF-Score between 0 and 9, which refers to the 10 clusters obtained from our neural model.
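The clustering step can be illustrated with a minimal one-dimensional k-medoids sketch. This is a simple alternating scheme (assign each value to its nearest medoid, then move each medoid to the member with the smallest total distance) and not necessarily the variant used in the study. The initialization and the mapping of clusters to ranks by ascending medoid are assumptions, and the example uses 3 clusters instead of 10 for brevity.

```python
def k_medoids_1d(values, k, max_iter=100):
    """Cluster 1-D values and return one rank label per value.

    Label 0 is the cluster with the lowest medoid, k - 1 the highest.
    """
    distinct = sorted(set(values))
    if k < 1 or k > len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Assumed initialization: evenly spaced picks from the sorted values.
    if k == 1:
        medoids = [distinct[len(distinct) // 2]]
    else:
        medoids = [distinct[(i * (len(distinct) - 1)) // (k - 1)] for i in range(k)]

    def nearest(v):
        # Index of the medoid closest to v (first one wins on ties).
        return min(range(k), key=lambda c: abs(v - medoids[c]))

    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest(v)].append(v)
        new_medoids = []
        for c, members in enumerate(clusters):
            if not members:
                new_medoids.append(medoids[c])  # keep an empty cluster's medoid
                continue
            # New medoid: the member with the smallest total absolute distance.
            best = min(members, key=lambda m: sum(abs(m - o) for o in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids

    # Rank clusters by medoid value so that higher scores get higher labels.
    rank = {c: r for r, c in enumerate(sorted(range(k), key=lambda c: medoids[c]))}
    return [rank[nearest(v)] for v in values]


# Hypothetical NF values for one year; the figures are invented.
nf_values = [0.10, 0.12, 0.50, 0.52, 0.95, 1.00]
print(k_medoids_1d(nf_values, k=3))
```

With k = 10, the returned labels run from 0 to 9 and play the same role as the 0-9 range of the original F-Score.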
Next, we obtained the equally weighted returns of the portfolios formed by the best and worst rated companies by both models. These returns were calculated as one- and two-year buy-and-hold returns earned from the beginning of the second quarter of the year following the selection of the companies. Similarly, the equally weighted returns of the portfolios formed by all of the companies, excluding the highest rated companies, were computed.
Piotroski (2000), thereby questioning again the successful application of the F-Score in large-cap companies that are extensively followed by worldwide analysts.
5 Piotroski (2000) used the top quintile because his sample was much larger as a result of not limiting the size of the company.
6 For those years in which maximum and/or minimum scores were not available for a company, the adjacent scores were taken as a reference.
However, Panel B shows that the NF-Score reliably selects high-BM companies, especially in the Eurozone market, where the portfolios formed by companies with good fundamentals obtained a 3.24% annual excess return at one year compared with companies with poor fundamentals. These results are much less successful for the U.S. market, where the rapid information-dissemination environment seems to severely restrict the utility of the analysis of the nine accounting signals (Piotroski, 2000) with respect to the Eurozone. This finding could be explained by the higher levels of analysts' coverage of large-cap companies in the U.S. market than in the Eurozone. Bolliger (2004) finds that forecast accuracy is negatively associated with the number of countries covered by the analysts. In addition, the significant number of chartered analysts in the U.S. market (CFA, 2018) together with local analyst advantages (Bae et al., 2008) call into question the importance of the analysis of basic accounting signals in U.S. large-cap companies.
Finally, Panel C shows the better selection skills of the NF-Score with respect to the original F-Score. The selection of companies with good fundamentals by our NF-Score significantly exceeds the returns obtained by the F-Score selection in both the Eurozone and U.S. markets. This better selection is also evident, especially in the U.S. market, in which the companies with the worst fundamentals chosen by the NF-Score obtained significantly lower returns than those selected by the F-Score. All of the results are robust for one- and two-year investment periods.
By modeling the interaction between the financial areas of a company, our NF-Score allowed us to obtain short-term annual excess returns of approximately 4% in long-short value strategies on large-cap Eurozone companies relative to those rated by the binary F-Score model. This improvement is even more relevant in the U.S. market, with a range of annual excess returns from 7.62% to 10.11% for one- and two-year investment periods, respectively.
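The return comparisons discussed in this section rest on three simple calculations: the buy-and-hold return of each company, the equally weighted return of a portfolio, and the long-short spread between the best and worst rated portfolios. The following Python sketch illustrates them under the simplifying assumption that returns are computed from start and end prices only, ignoring dividends. The function names and prices are hypothetical.

```python
from statistics import mean


def buy_and_hold_return(start_price: float, end_price: float) -> float:
    """Simple return from buying at start_price and holding to end_price."""
    return end_price / start_price - 1.0


def equally_weighted_return(returns: list) -> float:
    """Portfolio return when every company receives the same weight."""
    return mean(returns)


def long_short_spread(best_returns: list, worst_returns: list) -> float:
    """Return of the best rated portfolio minus that of the worst rated one."""
    return equally_weighted_return(best_returns) - equally_weighted_return(worst_returns)


# Hypothetical prices at the start and end of a one-year holding period.
best = [buy_and_hold_return(100.0, 110.0), buy_and_hold_return(50.0, 54.0)]
worst = [buy_and_hold_return(20.0, 21.0), buy_and_hold_return(80.0, 76.0)]
print(round(long_short_spread(best, worst), 4))
```

Comparing the spread produced by NF-Score portfolios with the spread produced by F-Score portfolios gives the kind of excess return reported above.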

Conclusions.
This work is the first neural approach to the F-Score model proposed by Piotroski (2000). Using the same accounting signals, our model allows for 1) overcoming the binary classification perspective between companies with good/bad fundamentals and 2) appropriately assessing the existing interaction among a company's main financial areas.
The analysis of a complete sample of the largest listed companies in the Eurozone and the U.S. market during the period 2006-2017 showed that our NF-Score significantly improves the short-term returns of long-short value strategies selected by the F-Score. However, the nine accounting signals proposed by Piotroski (2000) are not sufficient to obtain winning returns for U.S. large-cap companies. The first implication of these results is that the interaction between the financial areas in large-cap companies should be considered in further value strategies to obtain higher levels of performance. Second, further neural valuation models should include more sophisticated accounting signals to identify winning value strategies in rapid information-dissemination environments with high levels of analysts' coverage. Both implications define the avenues for future research in other markets and investment strategies.