Data on accessibility of corporate information and business transparency in Russia

This empirically based data description covers business transparency in Russia and the availability of corporate information to interested parties. Entry barriers in the form of a fee for hard copies of documents can be regarded as an important indicator of business publicity. The study summarizes current data on 5070 Russian enterprises in order to estimate the differentiation of document copying costs according to the developed model. The sample size also ensured high data quality and representativeness: the actual limiting error Δp was 3.47% at a 99% confidence level. The analysis relied on regional and sectoral data groupings to show the strength of various impact factors. Accordingly, correlation coefficients, average and weighted average cost values, and descriptive statistics served as secondary indicators. The cost value distributed along an interval scale is the major empirical result of the research. Examination of the data makes it possible to identify the level of availability of corporate information for various stakeholders and the general public, which is part of civil right enforcement in the field of information control and validity checking. Related scientific issues include pricing of non-core services of companies, corporate relations, and modelling of market behaviour. By building a representative data set, the authors seek to fill the factual gap on business transparency in Russia that exists in other disciplines.



Subject area: Economics
More specific subject area: Companies, economics of corporate relations
Type of data

Value of the data
• The data can be used for an alternative assessment of business transparency and of barriers to accessing corporate information.
• The data make it possible to consider extra factors in research models of regional and sectoral differentiation between Russian companies.
• The data will be useful for associated research on pricing of non-core services of companies and on market behaviour modelling.
• The data can be used to analyse the level of direct interaction between Russian enterprises and stakeholders.
• The data and techniques allow other scholars to replicate and expand the analysis and to check the results of similar empirical studies.

Data
The research data set is a collection of entries as of June 2018, compiled by the authors in compliance with the empirical model, about Russian companies and the costs that stakeholders incur when requesting hard copies of corporate documents. Financial indicators supplement the corporate information, making it possible to compute correlations and weighted average values for the cost. For the analysis, the authors used key financial indicators of enterprises for the previous reporting year (2016) taken from the SPARK-Interfax Information Agency [1]. They include share capital, net assets, return on sales, net profit margin (ROS), gross profit margin, assets, equity, revenue, and net profit/loss. Preliminary estimates of sample coverage by internal classification criteria (region and unstructured sectors on the register of the Interfax Corporate Information Disclosure Centre [2]) showed strong differentiation, so the authors changed the list of classification criteria. Federal districts served as the unit of regional distribution. For the presentation of the data set, the authors replaced unstructured sectors with aggregated sectors according to the Russian Classifier of Economic Activities (OKVED) [3]. Sectors with few companies were assigned to "others" in order to improve the representativeness of the estimated levels of corporate information accessibility.
The processed data are presented as tables and graphs. The graphs (Figs. 1 and 2) show the standard and Pareto distributions of the usual cost of making hard copies of documents requested by stakeholders, for all companies regardless of their line of business and place of incorporation.
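The descriptive indicators named in this section (average, median, mode and weighted average cost) can be illustrated with a minimal sketch. The records below are hypothetical and are not taken from the data set; the weighting by number of published documents is one of several weightings mentioned in the article, and the function name `weighted_average` is an assumption made for illustration.

```python
from statistics import mean, median, mode


def weighted_average(costs, weights):
    """Weighted mean of costs; weights could be document or event counts."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(c * w for c, w in zip(costs, weights)) / total_weight


# Hypothetical records: (cost per page, number of documents published).
records = [(0.0, 10), (4.0, 20), (5.0, 30), (5.0, 30), (100.0, 10)]
costs = [cost for cost, _ in records]
n_documents = [n for _, n in records]

summary = {
    "mean": mean(costs),
    "median": median(costs),
    "mode": mode(costs),
    "weighted_by_documents": weighted_average(costs, n_documents),
}
print(summary)
```

In this toy example a single expensive company pulls the plain average well above the median, while weighting by document count dampens its influence. This is the kind of gap between average and weighted average cost that the data set allows one to examine.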

Study area
The empirical study draws on the general population, i.e. enterprises on the register of the agency accredited by the Bank of Russia for corporate information disclosure. The Interfax Corporate Information Disclosure Centre [2] was chosen as the source of raw data. The register included public and privately held companies in the Russian securities market: joint-stock companies (widely represented), limited liability companies (sparsely represented) and state-owned corporations (sparsely represented). The register included both active companies and enterprises that had already ceased operations. Sampling involved several criteria and stages.
The composition of the general population was fixed as of the data collection time. As of June 2018, the registry contained 36,798 unique entries on enterprises with various degrees of activity. In forming the general population, the authors only considered enterprises incorporated in Russia; overseas enterprises that had disclosed corporate information in the Russian securities market were not taken into account. This information on the distribution by Russian regions and sectors served as a basis for the sample quality assessment at the sampling stages. The data for the final analysis were the data set produced in compliance with the empirical model.

Note: 0.00 means that documents are provided without a fee (free of charge); 4.00 is the cost median in Russia; 5.00 is the cost mode in Russia; 10.91 is the weighted average cost by the number of events of Russian companies; 12.88 is the weighted average cost by the number of documents and events of Russian companies; 15.79 is the weighted average cost by the number of documents of Russian companies; 22.96 is the average cost in Russia; Section A - Agriculture, forestry, hunting, fishing and fish farming; Section B - Mineral extraction; Section C - Manufacturing; Section D - Supply of power, gas and steam, air conditioning; Section F - Construction; Section G - Wholesale and retail trade, motor vehicle and motorcycle maintenance; Section H - Transportation and storage; Section J - Activities in the field of information and communications; Section K - Financial and insurance business; Section L - Real estate; Section M - Vocational, academic, and technical careers; Others (Section E - Water supply, water disposal, organization of waste collection and disposal, pollution elimination; Section I - Hotels and public catering enterprises; Section N - Administration and related supplementary services; Section O - Public administration and military safety and security and social care; Section P - Education; Section Q - Public health and social support; Section R - Culture, sports, leisure, and entertainment arrangements; Section S - Provision of other types of services).

Sample
To assess the quality of the samples, we calculated the limiting error for the observations made at the sampling stages that preceded the final research data set. We calculated the limiting error for confidence levels of 90%, 95%, and 99%. In total, sampling included three stages according to the empirical model and resulted in sample sizes of 9366, 6555, and 5070 companies respectively. At each stage of the empirical model, the sample covered at least 13.78% of the general population from the registry of companies (before applying the activity criterion). Such coverage allowed low values of the actual limiting error (Table 5), as well as of the average, minimum, and maximum limiting error among the internal classification criteria in the sample (Table 6). As classification criteria, we chose the regions of Russia (subjects and federal districts) as the place of incorporation, as well as unstructured sectors (according to the scale of the Interfax Corporate Information Disclosure Centre [2]). At the 99% confidence level P, the low limiting error of 2.55-3.47% points to the representativeness of the sample and of the empirical study.
As for the internal classification criteria, companies are unevenly represented in the data set. The specific share of companies in unstructured sectors spans 31.5 percentage points, from 0.2% (insurance companies) to 31.7% (other companies). Similar values are observed for the regions (federal districts) of Russia: the Central Federal District has the highest representation with a specific share of 29.4% of companies in the data set, while the Crimean Federal District has the lowest with 0.4%. The structure of companies in the sample shifts at the third stage of sampling, which makes cluster analysis by internal classification criteria difficult. This limitation does not apply to the general analysis of the distribution.
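The coverage figure can be checked directly from the stage sizes and the registry size given in the text. The article does not state the formula behind the limiting error, so the sketch below uses a common textbook form as an assumption: the limiting error of a proportion under simple random sampling without replacement, with the worst-case share p = 0.5 and a finite population correction. Its output is therefore illustrative and need not coincide with the values reported in Table 5. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def coverage(sample_size, population_size):
    """Share of the general population covered by the sample."""
    return sample_size / population_size


def limiting_error(sample_size, population_size, confidence, p=0.5):
    """Textbook limiting error of a proportion (an assumed formula).

    Uses the two-sided normal quantile and the finite population
    correction for sampling without replacement.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    fpc = 1 - sample_size / population_size
    return z * sqrt(p * (1 - p) / sample_size * fpc)


POPULATION = 36_798                      # registry entries as of June 2018
STAGES = {1: 9_366, 2: 6_555, 3: 5_070}  # sample size after each stage

for stage, n in STAGES.items():
    share = coverage(n, POPULATION)
    errors = {c: limiting_error(n, POPULATION, c) for c in (0.90, 0.95, 0.99)}
    formatted = ", ".join(f"{c:.0%}: {e:.2%}" for c, e in errors.items())
    print(f"stage {stage}: coverage {share:.2%}; limiting error {formatted}")
```

The third-stage coverage, 5070 out of 36,798, reproduces the 13.78% quoted above. Under this formula the limiting error grows as the sample shrinks from stage to stage and as the confidence level rises.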

Empirical models
The authors' research approach consists of a distribution analysis of the cost of making hard copies of corporate documents for stakeholders. Our goal is to identify the barriers to offline access to such information and to trace trends in the development of corporate ethical obligations and business practices in Russia with respect to business transparency. The empirical model assumes consecutive sampling of data that meet the given parameters.
At the first stage, we collected initial data on companies from the registry of the Interfax Corporate Information Disclosure Centre [2], subject to a technical restriction that each pair of classification criteria ("region-industry") returns no more than 400 entries. An additional sampling condition is the active status of a company, meaning that in January-June 2018 it posted or published at least one document or event. Based on these criteria, the first sample included 9,366 companies in total. The second stage of sampling focuses on the stated cost of making hard (printed) copies of requested documents, as well as the banking details for payments. This data unit is the core of the empirical research. Enterprises that had not submitted such a statement were excluded from the sample. The statements were not structured or standardized in any way, so they were processed manually with a random check (which might lead to a slightly higher error percentage in the estimates). After this stage, the intermediate sample included 6,555 companies.
The third and final stage of sampling yields the research data set. At this stage, incomplete statements were filtered out: the sample excluded enterprises that had not clearly stated the cost of making hard copies of documents. These are statements that contain only banking details, links to the company's rates posted on corporate sites or on resources other than the data collection service, or links to rates on deleted documents and pages. The sampling resulted in a database of 5070 companies, for which the authors later carried out a distribution analysis.
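The three sampling stages described above can be sketched as consecutive filters. This is a minimal illustration, not the authors' tooling: the record layout, field names and example companies are hypothetical, and in the study the second and third stages relied on manual processing of unstructured statements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Company:
    name: str
    posts_jan_jun_2018: int        # documents or events posted in Jan-Jun 2018
    has_cost_statement: bool       # a statement on copying cost was submitted
    stated_cost: Optional[float]   # cost per page; None if not clearly stated


def stage_one(companies: List[Company]) -> List[Company]:
    """Keep active companies: at least one document or event posted."""
    return [c for c in companies if c.posts_jan_jun_2018 >= 1]


def stage_two(companies: List[Company]) -> List[Company]:
    """Keep companies that submitted a statement on copying cost."""
    return [c for c in companies if c.has_cost_statement]


def stage_three(companies: List[Company]) -> List[Company]:
    """Keep companies whose statement clearly gives a cost.

    A cost of 0.0 (free of charge) is a valid value, so the check
    is against None rather than truthiness.
    """
    return [c for c in companies if c.stated_cost is not None]


# Hypothetical registry extract.
registry = [
    Company("Alpha", 3, True, 5.0),
    Company("Beta", 1, True, None),    # statement holds banking details only
    Company("Gamma", 2, False, None),  # no statement submitted
    Company("Delta", 0, True, 4.0),    # inactive in Jan-Jun 2018
    Company("Epsilon", 7, True, 0.0),  # copies provided free of charge
]

first = stage_one(registry)
second = stage_two(first)
final = stage_three(second)
print(len(first), len(second), len(final))
```

Applying the filters in sequence mirrors the shrinking sample sizes reported for the three stages (9,366, then 6,555, then 5070 companies).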
The data set has a number of research limitations and assumptions, mainly related to the pricing of hard copies of documents, which is the basis of the comparison and analysis. In our research, the cost is per page of a document in the national currency. The simulated case is described as follows:
a) The consumer is an individual who is not affiliated with the company and is not a shareholder.
b) The requested document is one page long.
c) The copy is in A4 format, one-sided, black and white, with no duplication.
d) Rates are for copying within 7 days of the request day. (Note that Russian legislation provides for shorter timeframes, up to 5 days, for providing shareholders with documents about general meetings of a company's participants. At the same time, it allows the standard period to be extended for 20 days if more than 10 documents are requested or their volume exceeds 200 pages. At its discretion, an enterprise may set a higher limit on the number and volume of documents to be delivered in the extended period, but must fix it in the Charter or another in-house record of the joint-stock company.)
e) Due to specifics of the Russian tax legislation, the cost of document copying includes VAT if the company's statement contains a proviso that VAT must be paid. In all other cases, the cost is shown as is.
f) Companies' statements on fees for document copying were published on different days, but we consider them valid and up to date (this assumption may lead to a slightly higher error percentage in the estimates if companies have not submitted their updated statements to the accredited institution, the Interfax Corporate Information Disclosure Centre [2]).
g) The authors have ignored the fact that, even where approved corporate rates for hard copying are in place, hard copies are in some cases free of charge. This might depend on a subjective managerial decision when the volume of requested documents is very limited and the aggregate value to be charged is insignificant.

Note: P is the reliability/accuracy; Δp is the limiting error; * 85 subjects of the Russian Federation; ** unstructured sectors according to the registry of companies of the Interfax Corporate Information Disclosure Centre [2] (banks; pulp, paper and woodworking industry; investment companies; light industry; mechanical engineering; research institutions; oil and gas production and refining; food industry; production of construction materials; communications; agriculture; insurance companies; construction; fuel industry; trade; transportation; services; chemical industry; holdings and property management companies; ferrous and non-ferrous industry; power sector; other); *** nine federal districts of the Russian Federation.
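Assumption e) on VAT can be illustrated with a short sketch. It reads the assumption as follows: when a statement says that VAT is payable in addition to the quoted rate, the recorded cost is the quoted rate plus VAT; otherwise the quoted rate is recorded as is. This reading, the function name `cost_per_page` and the 18% rate used in the example are assumptions made for illustration; the article does not state the rate applied.

```python
def cost_per_page(stated_cost: float, vat_payable_on_top: bool,
                  vat_rate: float) -> float:
    """Cost per page under one reading of assumption e).

    stated_cost: rate quoted in the company's statement, national currency.
    vat_payable_on_top: True if the statement says VAT is charged in addition.
    vat_rate: VAT rate as a fraction (an assumed input, not from the article).
    """
    if stated_cost < 0 or vat_rate < 0:
        raise ValueError("cost and VAT rate must be non-negative")
    if vat_payable_on_top:
        return stated_cost * (1 + vat_rate)
    return stated_cost


ASSUMED_VAT_RATE = 0.18  # illustrative value only

print(cost_per_page(10.0, True, ASSUMED_VAT_RATE))   # quoted rate plus VAT
print(cost_per_page(10.0, False, ASSUMED_VAT_RATE))  # quoted rate as is
```

Normalising all statements this way keeps costs comparable between companies that quote rates with and without VAT.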