A Novel System for Securely Sharing Macros of Spreadsheets of Organizations

Modern spreadsheet software provides built-in scripting languages for developing macros that automate operations on spreadsheets. Typically, a macro is stored as part of the spreadsheet on which it is supposed to operate. Since macros can be created by spreadsheet users who have little knowledge of computer programming, they are widely used by workers in many organizations. However, macros have the following three drawbacks. The first is the overhead cost of maintaining the same macros stored in many spreadsheets. The second is the incompatibility of macros among spreadsheet software of different versions and platforms. The third is the security risk of macro viruses. The objective of this paper is to propose a spreadsheet macro sharing system that can solve these drawbacks. The proposed system eliminates the need to use macro-enabled spreadsheets. While editing a spreadsheet, users can import relevant macros from a macro archive into their spreadsheets by using a macro-import add-in developed in this work. A digital signature is applied to the imported macros to confirm their source and to check that they have not been tampered with.


INTRODUCTION
Modern spreadsheet software, such as Microsoft Excel [1] and OpenOffice Calc [2], is widely used in financial activities and in most workplaces today. These software packages are integrated with built-in scripting languages, such as VBA (Visual Basic for Applications) [3] of the MS-Office Suite and BASIC [4] of OpenOffice.org. Due to limited budgets and human resources, many organizations cannot provide and maintain software that perfectly matches the requirements of their workers, who work in rapidly changing business situations. Therefore, many organization workers use spreadsheet software and its built-in scripting language to develop programs that automate their tasks on spreadsheets. This paper focuses on macros written in Microsoft Excel VBA because they are easily created by spreadsheet users who are not professional programmers. Furthermore, VBA macros are also supported by other software such as Microsoft Word and Microsoft PowerPoint.
Although macros are very useful, they have the following three drawbacks. The first is the maintainability of macros in spreadsheets stored on many computers. Typically, macro code is container-bound, meaning that it forms a part of the spreadsheet on which it is supposed to operate. When users copy a macro-enabled spreadsheet, the new spreadsheet also contains the same macro. This copy operation leads to a sprawl of macro code and versions that is difficult to maintain. The second is macro incompatibility among different versions of Microsoft Excel. Microsoft is continually upgrading Microsoft Excel. Even though Microsoft puts great effort into compatibility between versions, many developers have discovered that the VBA code they have written does not work properly with older versions of Microsoft Excel. This problem forces macro developers to develop separate versions of macros that fit particular operating systems and Microsoft Excel versions. As a result, macro version control and macro updates become an inevitable overhead for organizations that use macro-enabled spreadsheets. The third is the security problem that occurs when users open spreadsheets embedded with malicious macro code (hereafter called a macro virus). Macro viruses can spread through e-mail attachments, USB flash drives, networks, and the Internet. A well-known example of a macro virus is the Melissa virus of March 1999 [5]. Since many new macro viruses may not be detected even by up-to-date anti-virus software on user computers, Microsoft Excel users need other solutions to protect themselves from the macro virus threat.
The objective of this paper is to present a novel system for securely sharing spreadsheet macros that solves these drawbacks. To the best of the author's knowledge, no previous work discusses or proposes this solution. In this system, macros are classified by spreadsheet category, such as invoice or weekly sales summary. These macros are stored in a macro archive on a file server. While editing a spreadsheet, a user instructs the macro-import add-in developed in this work to download relevant macros from the macro archive and import them into the spreadsheet. A digital signature is applied to the imported macros to confirm their source and to check that they have not been tampered with after they were digitally signed.

A. Microsoft Excel File Formats
The default file format (whose file extension is .

B. Macro Code
As shown in Fig. 1, a macro-enabled spreadsheet of Microsoft Excel consists of two compartments, one for spreadsheet data and the other for VBA macro data. A macro is typed into a VBA module within a project by using the VBA Editor. A VBA project is a collection of VBA modules and other programming elements. A spreadsheet can contain only one VBA project. Macro modules are classified into three types: standard modules (.bas), user form modules (.frm), and class modules (.cls). Users can create their own objects with class modules. A module can be exported from a spreadsheet as an external file, which can then be imported into another spreadsheet.

C. Hash Functions
A hash function usually means a function used to map digital data of arbitrary size to digital data of fixed size. The input data of a hash function is often called the message, and the output value of the hash function is often called the message digest or the digest. The ideal hash function has the following four properties:
• It is easy to compute the hash value for any given message.
• It is infeasible to generate a message from its message digest.
• It is infeasible to modify a message without changing the message digest.
• It is infeasible to find two different messages with the same message digest.
There are a number of different hash functions in use, including Rivest's MD5 [7], which reduces a file to a 128-bit message digest, and NIST's Secure Hash Algorithm 3 (SHA-3) [8], which creates a 224-bit, 256-bit, 384-bit, or 512-bit message digest.
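The fixed-size property can be illustrated with a minimal Python sketch that uses the standard hashlib module. The two sample messages are made up for illustration; the algorithm names are those exposed by hashlib, not names used in this paper.

```python
import hashlib

# Two made-up messages of very different sizes.
messages = [b"Sub Hello()\nEnd Sub\n", b"x" * 1_000_000]

# Algorithm names as exposed by Python's hashlib module.
algorithms = ["md5", "sha3_224", "sha3_256", "sha3_384", "sha3_512"]

def digest_bits(algorithm: str, message: bytes) -> int:
    """Return the digest length in bits for the given algorithm and message."""
    return len(hashlib.new(algorithm, message).digest()) * 8

for name in algorithms:
    # The set holds a single value because the digest size does not
    # depend on the message length.
    sizes = {digest_bits(name, m) for m in messages}
    print(name, sorted(sizes))
```

Whatever the message length, each algorithm reports a single digest size, which is what allows a short digest to stand in for a large macro file when it is signed.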

D. Digital Signatures
A digital signature is a mathematical scheme for demonstrating the authenticity of a digital message. A valid digital signature allows a message recipient to confirm the following three items:
• Sender Authentication, meaning that the message was created by a known sender,
• Sender Non-Repudiation, meaning that the sender cannot deny having sent the message, and
• Message Integrity, meaning that the message was not altered in transit.
Digital signatures are commonly used for software distribution, financial information interchange, and in other cases where it is important to detect forgery or tampering. Digital signatures are based on public key cryptography. Using a public key algorithm such as RSA [9], one can generate two keys that are mathematically linked: one private key and one public key. The private key is then used to encrypt the message digest. The digital signature of a message consists of the encrypted message digest and other information, such as the hash function used.

E. Digital Certificates
A digital certificate is a seal of approval that enables an entity (such as a person, a computer, or an organization) to exchange information securely over the Internet using the public key infrastructure (PKI) [10]. The main purpose of the digital certificate is to ensure that the public key contained in the certificate belongs to the entity to which the certificate was issued. Encryption techniques using public and private keys require a PKI to support the distribution and identification of public keys. A digital certificate contains a public key, the hash functions used, owner or subject data, the digital signature of a Certificate Authority that has verified the subject data, and a date range during which the certificate can be considered valid. Without certificates, anyone can create a new key pair, distribute the public key, and claim that it is the public key of another person. One could send data encrypted with the private key, and the public key would be used to decrypt the data, but there would be no guarantee that the data originated from anyone in particular. All the message recipient would know is that a valid key pair was used.

III. ISSUES IN SECURELY SHARING MACROS
This section discusses issues that should be taken into account when sharing macros. When macros use API functions or ActiveX controls, macro developers need to be aware of potential problems if the code will run in both 32-bit and 64-bit Microsoft Excel. If the operating system of a user's computer is Macintosh, macro developers will likely be forced to create a separate version of the macro for Macintosh.
Solution: To solve this problem, this paper proposes that macro developers apply the #If, #ElseIf, #Else, and #End If directives to control the compilation of portions of macro code. Furthermore, macro developers can use two conditional compilation constants: VBA7 and Win64. The VBA7 constant ensures backward compatibility of macro code by testing whether the code is running under VBA7 of Microsoft Excel 2010 or under an earlier version. The Win64 constant tests whether macro code is running as 32-bit or as 64-bit. As shown in Fig. 2, the first #If directive checks whether the macro code is running in 64-bit Microsoft Excel 2010. If so, VBA compiles only the first declaration, whose PtrSafe keyword tells Microsoft Excel to replace the 32-bit integer variables used as pointers with 64-bit integer variables. Otherwise, only the second Declare statement is compiled. In this way, the PtrSafe keyword, which is not defined in versions prior to Microsoft Excel 2010, is never compiled in those versions.

B. Efficient Macro Sharing
By default, when users create a macro in Microsoft Excel, the macro works only in the spreadsheet that contains it. This is fine as long as they do not need to use that macro in other spreadsheets. If users find themselves recreating the same macros, they can copy those macros to a special spreadsheet called a personal macro workbook. The default filename of the personal macro workbook is Personal.xlsb. This file is saved in the following folder on Windows 7, 8, 8.1, and 10 computers: C:\Users\<username>\AppData\Roaming\Microsoft\Excel\XLSTART, where <username> denotes the name of the user who is logged in to the computer. Personal.xlsb is generally created when users create a macro and instruct Microsoft Excel to store it in the personal macro workbook so that the macro is available to them in each Microsoft Excel session. The macros stored in Personal.xlsb are automatically enabled by default. Although Personal.xlsb can be used to store common macros, it has the following two drawbacks. The first drawback is that the personal workbook may be a common and easy target for macro virus attacks. The user's information and computer are threatened if the personal workbook contains macro viruses. Regardless of the macro security setting, Microsoft Excel will unconditionally allow the macro virus to execute once users open a spreadsheet. The second drawback is the maintenance cost of updating common macros stored in the personal workbook on each user's computer.
Solution: To solve the above two drawbacks, this paper proposes to remove the folders used to store the personal macro workbook from the trusted locations in the Trust Center of Microsoft Excel. Instead of using the personal macro workbook, this paper proposes to use the Macro-Import Add-in, which was developed in this work, to download shared macros from trusted file servers. By default, an add-in is not immediately available in Microsoft Excel, so users must install it before they can use it. Generally, an add-in has the following two characteristics that set it apart from ordinary Microsoft Excel spreadsheets:
• The workbook window and all worksheets in an add-in are hidden from view. The aim is that the developer of the add-in can use the worksheets to store supporting data for the add-in. Since an add-in is designed to be transparent to the user, both the code and the supporting data are hidden from the user.
• Add-in macros are not displayed in the Macros dialog box. Therefore, add-in developers can hide them from the user.

C. Macro Virus Protection
Antivirus software is not a perfect solution for macro viruses since it cannot detect many unknown viruses. As an auxiliary measure provided by Microsoft, macro execution has been disabled by default in the Microsoft Office Suite since Office 2007. When users open a spreadsheet embedded with a macro, a warning on the Microsoft Excel message bar tells them that the macro has been disabled. However, macro virus creators have already overcome this obstacle by using simple social engineering techniques that lead a user to allow the macro viruses to run [11].
Solution: To minimize the risk of macro virus attacks, this paper proposes to set the macro security of Microsoft Excel to "Disable all macros without notification". As a result, no macro in a macro-enabled spreadsheet whose file extension is .xlsm, .xlsb, or .xltm will be executed. As a side effect of this solution, users cannot execute the existing legitimate macros in their Microsoft Excel files. To handle this side effect, this paper proposes to classify users' spreadsheets into categories, such as invoice or sales report. This paper also proposes to design a template for generating spreadsheets of the same category. The macros needed in each spreadsheet category are embedded in the template. After the macros of the template are exported into external macro files, all macros are removed from the template. Therefore, the template is used only for creating normal spreadsheets that do not contain macros. The external macro files of each category are stored in the macro archive of the users' organization. These macro files are downloaded and imported into normal spreadsheets by the Macro-Import Add-in.

IV. ARCHITECTURE OF THE PROPOSED FRAMEWORK
As depicted in Fig. 3, the proposed framework consists of the following three subsystems.

A. Template and Macro Preparation Subsystem
For simplicity, this research assumes that a system manager (SM for short) is the person in charge of managing the Macro-Import Add-in, templates, and macros used in the organization. Macro developers write macros into the templates, which are provided by organization workers. Information about the spreadsheet category is written into the title field, which is a file property of the template. Note that all spreadsheets created from a template inherit the title field from that template. After the macros pass the function and security tests by the SM, the macros are exported to external files (macro modules), which are then compressed into a zipped file. The zipped file is stored in the macro archive together with its signed macro digest. The macro digest generation is explained in the next section. After the macros are exported, they are removed from the template. The template is then stored in the template archive.
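The packaging step described above can be sketched in Python with the standard zipfile module. This is only an illustration under stated assumptions: the module names and contents are hypothetical, and the paper does not prescribe a particular zip layout or tool.

```python
import io
import zipfile

def package_macros(modules: dict[str, str]) -> bytes:
    """Zip exported macro modules in memory and return the archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(modules):
            # A fixed timestamp keeps repeated runs on the same platform
            # byte-identical, so the digest of the archive stays stable.
            info = zipfile.ZipInfo(name, date_time=(1980, 1, 1, 0, 0, 0))
            info.compress_type = zipfile.ZIP_DEFLATED
            archive.writestr(info, modules[name])
    return buffer.getvalue()

# Hypothetical macro modules exported from an "invoice" template.
modules = {
    "InvoiceTotals.bas": "Sub ComputeTotals()\nEnd Sub\n",
    "InvoiceForm.frm": "' hypothetical user form module\n",
}
zipped = package_macros(modules)

# Reading the archive back lists the packaged module files.
with zipfile.ZipFile(io.BytesIO(zipped)) as archive:
    print(archive.namelist())
```

Fixing the entry order and timestamp is a design choice of this sketch: it makes the archive bytes reproducible, which matters because the macro digest is computed over the zipped file as a whole.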

B. Template Download and Macro-Import Add-in Installation Subsystem
In this process, the users first install the Macro-Import Add-in into their Microsoft Excel. After the add-in is installed, its execution button appears in the add-in menu of the user's Microsoft Excel ribbon. Note that this installation is done only once on a user's computer. Second, the users download the required templates from the template archive. In this system, all spreadsheets are stored as normal spreadsheets whose file extension is .xlsx or .xls. Since these spreadsheets do not contain any macros, the security risk of a macro virus attack from opening them is greatly reduced.

C. Macro Verification and Import Subsystem
While editing a spreadsheet created from a provided template, the user can start the macro import with the macro-import add-in, whose execution button appears in the Microsoft Excel ribbon. The add-in program connects to the target file server. After the user passes the authentication of the file server, the add-in program determines which macros should be downloaded from the macro archive by considering the spreadsheet category and the Microsoft Excel version. The add-in program then downloads a zipped file containing the relevant macros, together with its digital signature. After the digital signature verification succeeds (see the details in the next section), the add-in program unzips the macro modules and imports them into the spreadsheet. As a security policy, the user is allowed to store the spreadsheet only as a normal spreadsheet whose file extension is .xlsx.
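How the add-in might choose a package from the spreadsheet category and the Excel version can be sketched as follows. The catalog contents, the file names, and the rule "newest package whose minimum version is satisfied" are assumptions made for this illustration; the paper does not specify the matching rule. The version numbers follow Excel's own numbering (12.0 for Excel 2007, 14.0 for Excel 2010).

```python
from typing import Optional

# Hypothetical catalog: category -> list of (minimum Excel version, package name).
CATALOG = {
    "invoice": [(12.0, "invoice_v12.zip"), (14.0, "invoice_v14.zip")],
    "sales report": [(12.0, "sales_report_v12.zip")],
}

def select_package(category: str, excel_version: float) -> Optional[str]:
    """Pick the newest package whose minimum version the client satisfies."""
    candidates = [
        (minimum, name)
        for minimum, name in CATALOG.get(category.strip().lower(), [])
        if minimum <= excel_version
    ]
    if not candidates:
        return None  # unknown category, or Excel too old for every package
    return max(candidates)[1]

# The category would come from the title field inherited from the template.
print(select_package("Invoice", 14.0))
print(select_package("Invoice", 12.0))
print(select_package("Payroll", 16.0))
```

Returning None for an unknown category lets the add-in refuse to import anything rather than fall back to a default package.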

V. SECURITY SOLUTION FOR THE SHARED MACROS
In order to guarantee that the shared macros are safe to use, this paper proposes the following three processes: Macro Audit Control, Macro Digital Signing Process, and Macro Download and Verification Process.

A. Macro Audit Control
The goal of this process is to ensure that the macro does not contain any malicious code. Each macro should be carefully inspected and tested by the SM or by a third party who is not the macro developer before it is released for sharing.

B. Macro Digital Signing Process
The goal of this process is to digitally sign the macros with the private key of the macro developer. The macro digest is computed from the zipped file by the MD5 algorithm, a widely used hash function that produces a 128-bit hash value. The macro developer generates the digital signature of the macro digest by encrypting the macro digest with the developer's private key. The encrypted macro digest and the digital certificate of the macro developer are submitted to the SM. The zipped file of macro modules and its encrypted macro digest are stored in the macro archive. The digital certificate of the developer is stored in the certificate archive for later reference.
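The signing step can be sketched in Python with textbook RSA. This is a toy illustration, not the implementation used in this work: the key pair is built from two small Mersenne primes chosen only so that the 128-bit MD5 digest fits below the modulus, no padding scheme is applied, and the zipped file is replaced by stand-in bytes. A real deployment would rely on a vetted cryptographic library and much larger keys.

```python
import hashlib

# Toy RSA key pair built from two Mersenne primes. For illustration only:
# the key is far too small for real use and no padding scheme is applied.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def macro_digest(zipped: bytes) -> int:
    """MD5 digest of the zipped macro file, as an integer smaller than N."""
    return int.from_bytes(hashlib.md5(zipped).digest(), "big")

def sign_digest(digest: int) -> int:
    """Encrypt the digest with the private key (textbook RSA)."""
    return pow(digest, D, N)

# Stand-in bytes; in the proposed system this is the zipped macro modules.
zipped_file = b"stand-in bytes for the zipped macro modules"
encrypted_digest = sign_digest(macro_digest(zipped_file))

# Anyone holding the public key (N, E) can recover the original digest.
print(pow(encrypted_digest, E, N) == macro_digest(zipped_file))
```

Only the holder of the private exponent can produce a value that decrypts to the correct digest, which is what lets the archive store the zipped file and its encrypted digest side by side.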

C. Macro Download and Verification Process
Given a digital certificate, a zipped file of macro modules, and an encrypted macro digest, this process checks whether the zipped file of macro modules is trustworthy. The zipped macro file is trustworthy if both of the following conditions are satisfied:
(1) The given digital certificate is registered in the certificate archive, and
(2) The macro digest generated by decrypting the encrypted digest with the public key stored in the digital certificate (see Fig. 5) is the same as the macro digest generated by applying the hash function to the given zipped file.
If the first condition is satisfied, the given digital certificate belongs to a macro developer certified by the organization. Since a digital certificate has an expiration date, the certificate database should be maintained daily so that it contains up-to-date information. If the second condition is satisfied, the zipped macro file comes from the certified macro developer and has not been tampered with since it was signed. After the macro-import add-in confirms that both conditions are satisfied, the add-in program extracts all macro modules from the zipped file and imports them into the spreadsheet.
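The two conditions can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the Certificate class, the in-memory certificate archive, and the explicit expiration check are hypothetical stand-ins, and the toy textbook-RSA key (two small Mersenne primes, no padding) is for illustration only and is not the implementation used in this work.

```python
import hashlib
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Certificate:
    """Hypothetical, stripped-down certificate record."""
    subject: str
    modulus: int     # public key, n
    exponent: int    # public key, e
    not_after: date  # expiration date

# Toy key pair (two Mersenne primes); far too small for real use.
P, Q, E = 2**127 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

developer_cert = Certificate("macro-developer", N, E, date(2099, 12, 31))
certificate_archive = {developer_cert}  # certificates registered by the SM

def digest_of(zipped: bytes) -> int:
    return int.from_bytes(hashlib.md5(zipped).digest(), "big")

def is_trustworthy(cert: Certificate, zipped: bytes,
                   encrypted_digest: int, today: date) -> bool:
    # Condition (1): the certificate is registered (and, as an added
    # assumption of this sketch, has not expired).
    if cert not in certificate_archive or today > cert.not_after:
        return False
    # Condition (2): the decrypted digest equals the recomputed digest.
    recovered = pow(encrypted_digest, cert.exponent, cert.modulus)
    return recovered == digest_of(zipped)

zipped_file = b"stand-in bytes for the zipped macro modules"
encrypted_digest = pow(digest_of(zipped_file), D, N)  # done by the developer

check_day = date(2024, 1, 1)  # fixed date so the example is reproducible
print(is_trustworthy(developer_cert, zipped_file, encrypted_digest, check_day))
print(is_trustworthy(developer_cert, zipped_file + b"!", encrypted_digest, check_day))
```

The first call models an untouched download and the second a file modified after signing; only when the function returns True would the add-in go on to unzip and import the macro modules.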

VI. RELATED WORK
Recently, two approaches have been proposed for solving the container-bound problem of spreadsheet macros. The first is converting Office VBA code to Visual Basic .NET by using Microsoft Visual Studio Tools for Office (VSTO) [12]. Since programs developed with VSTO are not stored in spreadsheets, they can be shared among spreadsheets. However, the best way to migrate a VBA macro is to completely redesign the entire solution. Since this approach requires the most VBA, VSTO, and domain knowledge, it is the most costly and time-consuming. The second is converting Excel spreadsheets to Google spreadsheets. Google Spreadsheet lets people collaborate and share information over the Internet. It supports Apps Script [13], which is based on the ECMAScript specification, and scripts can be shared among users. However, it is difficult for many organization workers to migrate from macro-enabled Microsoft Excel spreadsheets to Google Spreadsheet and Apps Script for the following two reasons. The first is that there are many feature incompatibilities between Microsoft Excel spreadsheets and Google Spreadsheet. The second is the migration cost: the migration requires skilled programmers and much development time to convert VBA macro code to Apps Script.
The following solutions have been proposed to detect and handle macro viruses. The simplest and most widely used malware detection method is the signature-based method, which requires forensic experts to study the behavior of each piece of malware and to update malware signatures in a database [14]. Because typical malware detection methods are based on signatures, they have difficulty detecting polymorphic viruses [15] and newly appearing malware whose signatures have not yet been analyzed. The Microsoft Office Suite allows users to digitally sign macros. The digital signature enables a user to know that a macro comes from a trusted source and that it has not been modified since it was originally saved by that trusted source. This solution is realized by setting macro security to "Disable all macros except digitally signed macros" and registering the digital certificates of all trusted macro creators on the computers of all users. However, this method incurs a large overhead in registering and maintaining digital certificates on all computers of an organization. Kim and Moon [16] proposed a method that uses dependency graph analysis to detect script malware. A piece of script malware is represented by a dependency graph, and malware detection is thereby transformed into the problem of finding the maximum subgraph isomorphism. Ko [17] proposed a flow analysis of macro operations to determine whether the investigated macro is malware. Based on the values associated with variables, the system extracts the control and data flow from the macro, compares the flow with that of known suspects, and measures the similarity. However, some malware cannot be detected by these methods.

VII. CONCLUSIONS AND FUTURE WORK
This paper has presented a novel system for securely sharing spreadsheet macros within organizations. The shared macros are classified by spreadsheet category. The set of shared macro modules of a spreadsheet category is stored as a zipped file. A digital signature is applied to the zipped file to identify its source and to check that it has not been tampered with after it was digitally signed. In order to guarantee that the shared macros are safe to use, this paper has proposed the following three processes: Macro Audit Control, Macro Digital Signing Process, and Macro Download and Verification Process. In order to eliminate the risk of macro virus attacks, the macros of macro-enabled spreadsheets are completely disabled by the macro security settings. This paper has introduced a macro-import add-in that allows users to perform the macro download and verification process and to import relevant macro modules into their spreadsheets. After the users finish editing, the spreadsheets are stored in the ordinary format, which does not contain any macros.
The author has already developed and tested the program components needed to confirm the feasibility of the proposed system. The next task is to integrate these components into the proposed system and to evaluate the import time and the security of the shared macros.

Figure 1. The data model of a macro-enabled spreadsheet file.

Figure 2. An example of using two conditional compilation constants: VBA7 and Win64.

Figure 3. Architecture of the proposed system.
A. Macro Code Incompatibility
Some macro features of Microsoft Excel 2003 are not supported by Microsoft Excel 2007 or newer versions of Microsoft Excel. For instance, the Application.FileSearch property was removed in Microsoft Excel 2007. Any use of this property in Microsoft Excel 2007 returns an error. To work around the missing property, users can use the FileSystemObject to recursively search directories for specific files. Microsoft Excel 2010 is the first version that is available for both 32-bit and 64-bit systems. There are two fundamental issues when users run existing macros with the 64-bit version of Office 2010:
• Native 64-bit macros of Microsoft Excel 2010 cannot load 32-bit binaries. This is expected to be a common issue when the users have existing Microsoft ActiveX controls and existing add-ins.
• VBA previously did not have a pointer data type, and because of this, macro developers used 32-bit variables to store pointers and handles. These variables now truncate 64-bit values returned by API calls when Declare statements are used.
In case macros use API functions or ActiveX controls,