STEGANOGRAPHIC ALGORITHM FOR INFORMATION HIDING USING SCALABLE VECTOR GRAPHICS IMAGES

Alongside cryptography, considerable attention is paid to steganography, which is regarded as both the science and the art of concealed communication. In steganography, information is hidden within another piece of information, called stegomedia. This paper presents a new data hiding technique, based on steganographic algorithms, that hides information in vector images. The paper briefly introduces the technique and evaluates the benefits and drawbacks of the proposed approach.


INTRODUCTION
There has been an explosive growth of digital multimedia, communication and computer applications during the last decades. The wide distribution of digital information goes hand in hand with easy data modification and duplication. The need to secure information and personal data is gaining importance not just in the military but also in everyday life.
Data hiding techniques received less attention from the academic community than cryptography, but the situation changed at the beginning of the 1990s. The first academic conference on the subject of data hiding was held in 1996. Since then many other conferences have taken place, followed by many articles in periodicals and journals.
It is often assumed that a communication channel is secure once cryptography is applied to the ongoing communication. This is not always true, as the selected cryptographic technique can have weak points, for example unknown bugs in the protocol, the continual increase in the computing speed of modern computers and the related brute-force methods, password phishing attacks, etc. Steganography, in contrast to cryptography, conceals the fact that two sides are exchanging information. If used along with cryptography, it hampers the attacker's efforts: the attacker has to discover that communication is taking place before trying to break the communication protocol.
The aim of this work is to present an overview of modern steganography techniques, to place the proposed steganographic algorithm within the data hiding hierarchy and to analyse its benefits and drawbacks based on defined criteria. The goal is to define a new approach for hiding secret messages in Scalable Vector Graphics (SVG) images that makes minimal modifications to the vector cover image.
If steganography is the art of hiding information in other information, then the art of detecting secret messages embedded using steganography is called steganalysis. The aim of steganalysis is to determine whether a cover message contains a secret message; it is based mostly on statistical methods. It compares the statistical deviation between what the cover image looks like and what it should look like, and tries to predict the length of the secret message.
The less the cover image has been modified, the harder it is for steganalysis to determine whether any secret communication is taking place, and the more successful the steganographic method can be considered. The design of the proposed algorithm concentrates mainly on this fact.

SUBJECT
Steganography is one of the techniques that specialize in information hiding (Fig. 1) [1]. Covert channels are defined as any kind of data transport method that was not originally intended for this purpose [2] [3].
Watermarking is in some aspects similar to steganography, but in contrast to steganography it needs to be robust, i.e. it should either withstand manipulation or render the manipulated data useless in the process of destroying the watermark [4] [5] [6].

Fig. 1 Classification of information hiding techniques
Steganography can be further classified according to the chosen criteria. Classical steganography includes steganographic methods used before the computer era (Fig. 2) [7].
Linguistic steganography consists of methods that conceal messages in natural human language, commonly using cover media in the form of written text [8] [9] [10]. Examples include acrostics, Javanais, Pig Latin, semagrams (which hide the message in small graphical details), cues, etc.

Fig. 2 Classification of classical steganographic methods
Technical steganography hides the message with the help of scientific methods using various types of cover media [7]. Examples include the German microdot and invisible ink.
Modern steganographic techniques take advantage of computer systems. They can be divided according to the type of cover medium used, for example images, audio, video or text. Image steganography can be performed using raster or vector cover images. The basic type of raster image steganography is LSB steganography, which hides the message in the least significant bits of the raster cover image. The changes are so small that they are invisible to the naked eye.
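As an illustration of LSB steganography, the following minimal Python sketch hides a message in the lowest bit of each cover byte. It assumes the cover is a flat sequence of 8-bit samples (for example uncompressed pixel channel values); the function names `lsb_embed` and `lsb_extract` are hypothetical and not taken from any referenced implementation.

```python
def lsb_embed(cover, message):
    """Hide `message` (bytes) in the least significant bits of `cover` (bytes)."""
    # Split the message into single bits, most significant bit first.
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(stego)


def lsb_extract(stego, n_bytes):
    """Read `n_bytes` of hidden message back from the lowest bits."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for j in range(8):
            value = (value << 1) | (stego[i * 8 + j] & 1)
        out.append(value)
    return bytes(out)
```

Each cover byte changes by at most 1, which is why the modification is invisible in a typical raster image.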
Vector image steganography methods can be divided into jittering and embedding [11] [12]. Jittering is similar to LSB steganography in that it stores the secret message in the least significant digits of point coordinates in vector images.
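Jittering can be sketched in a few lines of Python. The sketch assumes that coordinates are written as non-negative decimal strings with a fixed number of decimal places and that one message digit replaces the last kept decimal digit; both the precision of three places and the names `jitter_embed` and `jitter_extract` are illustrative assumptions.

```python
from decimal import Decimal


def jitter_embed(coord, digit, places=3):
    """Cut `coord` (decimal string) to `places` decimals and replace the
    last kept decimal digit with `digit`."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be in 0..9")
    value = Decimal(coord)
    if value < 0:
        raise ValueError("this sketch handles non-negative coordinates only")
    scaled = int(value.scaleb(places))     # e.g. 12.3456 -> 12345
    scaled = scaled - scaled % 10 + digit  # overwrite the last digit
    return str(Decimal(scaled).scaleb(-places))


def jitter_extract(coord, places=3):
    """Read the hidden digit back from the last kept decimal digit."""
    return int(Decimal(coord).scaleb(places)) % 10
```

For example, embedding the digit 7 into the coordinate 12.3456 gives 12.347, a shift of less than one hundredth of a unit.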
Embedding stores the secret message in extra image points generated directly on other graphical objects. For example, if there is a line consisting of two points, an extra point can be placed directly on the line without changing its visual form (Fig. 3) [12].

Fig. 3 Message hidden using embedding algorithm
The algorithm creates a collection of points along a line, starting with a reference point. The length defined by the reference point is compared to the lengths defined by the other points. The same length may represent bit 0 and twice the length bit 1. It is possible to encode, for example, 8 bits in one extra image point by introducing 256 different lengths, etc. The algorithm is more robust than jittering because it is resistant to modifications such as scaling, translation and rotation.
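The following Python sketch shows one possible reading of this scheme. It assumes that the "lengths" are the distances between consecutive extra points on the line, that the first segment is the reference, and that a segment of the same length encodes 0 and a segment of twice the length encodes 1. The function names and the `unit` parameter are hypothetical.

```python
import math


def embed_bits(a, b, bits, unit):
    """Place extra points on segment AB; segment lengths encode the bits."""
    ax, ay = a
    bx, by = b
    length = math.dist(a, b)
    if unit * (1 + sum(1 + bit for bit in bits)) >= length:
        raise ValueError("segment AB is too short for the message")
    ux, uy = (bx - ax) / length, (by - ay) / length  # unit direction A -> B
    t = unit                                         # reference segment
    points = [a, (ax + ux * t, ay + uy * t)]
    for bit in bits:
        t += unit * (1 + bit)                        # 1x length = 0, 2x = 1
        points.append((ax + ux * t, ay + uy * t))
    points.append(b)
    return points


def extract_bits(points):
    """Recover bits from a list [A, reference, p1, ..., pn, B]."""
    inner = points[:-1]                              # drop end point B
    segments = [math.dist(p, q) for p, q in zip(inner, inner[1:])]
    reference = segments[0]
    return [round(s / reference) - 1 for s in segments[1:]]
```

Because only the ratio of each segment to the reference segment is used, uniformly scaling or translating all points leaves the decoded bits unchanged, which illustrates the robustness claim above.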
A new algorithm for vector steganography, proposed in this paper, is based on the embedding algorithm (Fig. 4). The proposed algorithm transforms the secret message from a byte array to a stream of decimal digits. The stream is divided into smaller parts, labeled m, whose size is based on the precision of numbers in the SVG vector image. The secret message can be transformed to the digit stream in different ways, depending on the chosen encoding method. To determine the most efficient encoding, i.e. the one that generates the fewest output digits, a program was implemented as part of this work and tested on a secret message of 12 MB of random data (see Table 1).
The Bits column gives the number of bits removed from the secret message in one step, and the Digits column gives the number of decimal digits created. Extra bit is the chance of taking one extra bit from the secret message byte array. For example, the 3 bit encoding encodes the digits 2 to 7 using three bits, but the digits 0, 1, 8 and 9 can be encoded using four bits, which shortens the message by one bit.
Total removes is the total number of steps needed to transform the whole message, and Total digits is the number of generated decimal digits. Not all bit encodings make sense: for example, the 1 and 2 bit encodings always generate more digits than the 3 bit encoding, because they take fewer extra bits in each step.
The shortest digit stream was generated using the 63 bit encoding. The Prolongation column gives the extension of the secret message when it is encoded using some other type of encoding.
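The extra-bit idea can be sketched in Python as follows. The exact bit ordering is not specified in the text, so this sketch makes an assumption: each step reads k bits as a value v, and if v plus 2^k still fits into the same number of decimal digits, one more bit b is read and the group stores v + b·2^k. With k = 3 this reproduces the example above (0, 1, 8 and 9 carry four bits). The special handling of the final chunk is not reproduced; leftover bits are simply returned as `tail`. All names are hypothetical.

```python
def encode_chunks(bits, k):
    """Turn a bit string into fixed-width decimal groups, k or k+1 bits each."""
    width = len(str(2**k - 1))   # decimal digits needed for any k-bit value
    spare = 10**width - 2**k     # k-bit values below this can carry 1 more bit
    groups, pos = [], 0
    while pos + k <= len(bits):
        value = int(bits[pos:pos + k], 2)
        pos += k
        if value < spare and pos < len(bits):
            value += int(bits[pos]) << k   # store the extra bit above bit k-1
            pos += 1
        groups.append(f"{value:0{width}d}")
    return groups, bits[pos:]    # tail: fewer than k leftover bits


def decode_chunks(groups, tail, k, n_bits):
    """Inverse of encode_chunks; n_bits is the original bit-string length."""
    width = len(str(2**k - 1))
    spare = 10**width - 2**k
    out, emitted = [], 0
    for group in groups:
        value = int(group)
        low, extra = value % 2**k, value >> k
        out.append(f"{low:0{k}b}")
        emitted += k
        if low < spare and emitted < n_bits:   # same condition as the encoder
            out.append(str(extra))
            emitted += 1
    return "".join(out) + tail
```

For k = 63 every group is 19 digits wide, and a group carries a 64th bit whenever its 63-bit value is below 10^19 − 2^63.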

SOLUTION AND RESULTS
This section focuses on the implementation of the proposed solution and summarizes attained results.

The design of the vector steganography algorithm
The proposed algorithm begins by preparing the secret message for encoding (Fig. 5). The message is transformed from a byte array to a digit stream using the 63 bit encoding. In each step the algorithm takes 63 or 64 bits and turns them into 19 decimal digits. The last 63 or fewer bits are an exception, since they can be turned into anywhere from 20 down to 2 digits, depending on the minimum number of bits required to encode the number (the one extra digit is used for decoding purposes). The algorithm then cycles through all the polygons of the vector cover image and reads the point coordinates of each polygon. If the coordinates are in an unsupported format, the polygon is ignored. Otherwise the algorithm checks whether this is the first supported polygon found and, if so, initialises its first point with metadata. Next, the polygon needs to be prepared. The algorithm encodes the points containing the secret message on other vector objects, so no extra points may already be present on the vector objects in the image before encoding. After the message is encoded, a precision correction is needed, because the floating point coordinates of the extra points are always rounded based on the SVG standard.
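The polygon traversal step can be illustrated with a short Python sketch based on the standard library XML parser. It is not the implementation described in this paper: the definition of a "supported format" used here (plain decimal numbers in x, y pairs separated by spaces or commas) is an assumption made for the example, as is the function name `supported_polygons`.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
NUMBER = r"[+-]?(?:\d+\.?\d*|\.\d+)"   # plain decimals only, no exponents
PAIR_LIST = re.compile(
    rf"\s*{NUMBER}[\s,]+{NUMBER}(?:[\s,]+{NUMBER}[\s,]+{NUMBER})*\s*"
)


def supported_polygons(svg_text):
    """Return the point lists of all polygons whose `points` attribute
    matches the assumed supported format; other polygons are skipped."""
    root = ET.fromstring(svg_text)
    result = []
    for element in root.iter(f"{SVG_NS}polygon"):
        raw = element.get("points", "")
        if not PAIR_LIST.fullmatch(raw):
            continue                     # unsupported format: ignore polygon
        values = [float(v) for v in re.findall(NUMBER, raw)]
        result.append(list(zip(values[0::2], values[1::2])))
    return result
```

The first entry of the returned list corresponds to the first supported polygon, which is where the metadata point would be initialised in the procedure described above.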

Evaluation of the proposed algorithm
The proposed algorithm was tested on random input messages of varying length and compared to the default algorithm, which uses the 8 bit encoding without the extra bit. The results show that the proposed algorithm shortens the encoded secret message by up to 20% (Table 2).
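The order of magnitude of this saving can be checked with a short calculation. The Python sketch below assumes uniformly random message bits and the extra-bit rule as interpreted in this rewrite (an extra bit is taken whenever the k-bit value plus 2^k still fits into the same number of digits); it estimates the expected number of message bits carried per decimal digit. The function name is hypothetical and the result is an estimate, not a measurement from Table 2.

```python
from fractions import Fraction


def bits_per_digit(k, use_extra_bit):
    """Expected message bits carried per decimal digit for a k-bit encoding."""
    width = len(str(2**k - 1))              # digits produced per step
    spare = min(10**width - 2**k, 2**k)     # k-bit values that allow an extra bit
    p_extra = Fraction(spare, 2**k) if use_extra_bit else Fraction(0)
    return (k + p_extra) / width


baseline = bits_per_digit(8, False)         # 8 bits -> 3 digits
proposed = bits_per_digit(63, True)         # 63 or 64 bits -> 19 digits
saving = float(1 - baseline / proposed)     # relative reduction in digit count
```

Under these assumptions the 8 bit encoding carries 8/3 ≈ 2.67 bits per digit, the 63 bit encoding about 3.32 bits per digit (close to the theoretical limit of log2(10) ≈ 3.32), and the estimated saving is a little under 20%, which is consistent with the reported figure.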

CONCLUSIONS
The proposed vector steganography algorithm shortens the encoded secret message thanks to the efficient conversion of message bytes into a decimal digit stream. As a result, fewer extra points are generated in the vector image, which makes the image less suspicious and more secure. The algorithm uses the 63 bit encoding, a limit imposed by the fundamental integral types of the C++ language. It would be interesting to research more efficient message conversions using wider variables. Another area for research is the improvement of the precision correction, with the goal of finding the largest number of digits that can be encoded into a single point by manipulating its coordinates accordingly. Finally, the algorithm's security may be improved by placing the extra points onto other vector objects, for example Bézier curves.

Fig. 4 Message hidden using embedding algorithm
The x and y coordinates are calculated using simple trigonometry and rounded based on the chosen precision. The value of the message m stored in point E can be restored using the values x, y and the Pythagorean theorem.
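The trigonometric placement and the recovery via the Pythagorean theorem can be sketched in Python as follows. The sketch assumes that the message part m is stored as the distance of the extra point E from a reference point A, divided by a scale factor; the `scale` and `places` parameters and the function names are illustrative assumptions, not values taken from the paper.

```python
import math


def place_point(a, b, m, scale=1000, places=6):
    """Place extra point E on segment AB at distance m / scale from A."""
    ax, ay = a
    bx, by = b
    distance = m / scale
    if not 0 < distance < math.dist(a, b):
        raise ValueError("E would not lie strictly between A and B")
    angle = math.atan2(by - ay, bx - ax)          # direction of the line
    x = round(ax + distance * math.cos(angle), places)
    y = round(ay + distance * math.sin(angle), places)
    return x, y


def recover_value(a, e, scale=1000):
    """Recover m from A and E using the Pythagorean theorem."""
    dx, dy = e[0] - a[0], e[1] - a[1]
    return round(math.sqrt(dx * dx + dy * dy) * scale)
```

Recovery works only while the rounding error of the coordinates stays well below half a unit of m after scaling, which is the kind of constraint the precision correction step has to enforce.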

Table 2 Number of generated digits comparison

Ján Hurtuk (Ing.) was born on 4th October 1988 in Kežmarok. In 2013 he graduated (MSc.) at the Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, TUKE. Since 2014 he has been a PhD student at the Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, Technical University of Košice.

Marek Čopjak (Ing.) was born on 29th June 1987 in Kežmarok. In 2012 he graduated (MSc.) at the Department of Cybernetics and Artificial Intelligence of the Faculty of Electrical Engineering and Informatics at the Technical University of Košice. Since 2013 he has been a PhD student at the Department of Computers and Informatics of the Faculty of Electrical Engineering and Informatics at the Technical University of Košice.

Peter Hamaš (Ing.) was born in Košice, Slovakia, in 1990. He received his master's degree in Informatics in 2014 from the Faculty of Electrical Engineering and Informatics, Technical University of Košice. His research focuses on methods of concealing messages using data hiding and steganography. Since 2014 he has been working as a programmer specializing in CAD systems for Ekosoft s.r.o., Košice.

Michal Ennert (Ing.) was born on 4th August 1987 in Revúca, Slovakia. In 2011 he graduated at the Department of Computers and Informatics of the Faculty of Electrical Engineering and Informatics at the Technical University of Košice and received the engineering degree. Since 2011 he has been a PhD student. His research and experiments are mainly in the field of computer security using GPGPU technology and in the field of distributed software architecture.