Interactive Multimedia-Based Educational System for Children Using Interactive Book with Augmented Reality

Abstract: This paper describes a system model for Augmented Reality learning in which a traditional book is converted into an interactive book using glyphs (tags) and multimedia. The interactive book can be used by a child, a parent or a teacher to make learning an enjoyable experience. As the child goes through the contents of the book, illustrations and images come to life, reinforcing the learning and comprehension of concepts in an interactive and fun way. To make a printed book interactive, special tags (glyphs) are inserted in the required places within the book, ready to be read by a webcam and then converted to video, 2-D or 3-D images, audio and explanatory text. An actual example (Sandy Starfish) is presented to illustrate the architecture and implementation of the Augmented Reality learning system and to explain the steps and procedure used to transform a textbook into an interactive one.


Introduction
Improving education has been the focus of several efforts around the world. Each of these efforts has different roots and provides different models for changing teaching and learning (Issa et al., 2014a; 2014b; El-Ghalayini et al., 2017; Black and Atkin, 1996). Over the past couple of decades, digital devices and interactive media applications such as interactive books and video games have been added to the list of learning tools in preschool classrooms (Romano et al., 2007). Issa et al. (2010) proposed a framework of interactive satellite TV-based M-learning for constructing a cost-effective, reliable M-learning system that can be used by a large number of learners and teachers.
The Federation of American Scientists (2006) supports the use of digital devices and video games in classrooms, stating that "educational games are fundamentally different than prevailing instruction because they're based on challenge, reward, learning through doing and guided discovery in contrast to the 'tell and test' methods of traditional instruction" (Romano et al., 2007).
Many research papers focus on teaching methods for such interactive environments in order to establish standards for evaluating students' ability to be interactively involved in class. This seems necessary given the distractions of connected environments such as the Internet and its various applications. It can be achieved by encouraging students to use these applications and tools in their education alongside their entertainment or social activities (Nusir et al., 2013). Another study (Tervakari et al., 2013) examined the effect of social media on students' learning and cooperation; 83% of the students agreed that using social media enhanced their learning.
To assist teachers in determining which IT tool to use given the time constraints of students and teachers, Lin et al. (2014) proposed an evaluation model that combines the Analytic Hierarchy Process (AHP) with Multi-Choice Goal Programming (MCGP). The model was implemented and used in computer courses at a university in central Taiwan.
Currently, learning is being strongly affected by the emergent cultures of the 21st century, such as blogging, file sharing, gaming and socializing. Learners' attitudes have shifted towards openness, self-paced learning and teamwork, and away from traditional learning that takes place only in class. Accordingly, new technologies have to be combined with traditional methods to attract learners (Tang et al., 2009; Connolly et al., 2009).
Entertainment-driven learning, which depends on meaningful activities and exercises defined in a game context, has become a more attractive form of learning than traditional didactic approaches, particularly when it builds on gaming technologies to create a fun, motivating and interactive virtual learning environment (Connolly et al., 2009).
One of the most studied tools for enhancing teaching techniques is Augmented Reality (AR), which is used to build interactive learning environments and interactive books. Grasset et al. (2008) provided an audio-visual experience built around "The House that Jack Built", a well-known book in New Zealand. The system offered users alternative ways to navigate through the book, using either a handheld device with an attached camera or a handheld camera controlled by the user through a computer screen.
Although the work in (Grasset et al., 2008) concentrates on using AR in teaching environments, many research areas benefit from AR in other domains. For example, Nilsson et al. (2011) developed an AR system that supports cooperation between multiple parties on joint tasks, such as a crisis management scenario in which rescue services, the police and military personnel are given organization-specific views of a shared map.
AR is at the core of much research in several domains. One such domain is the tracking of features or objects: Ribo et al. (2002) described a hybrid tracking system for mobile wearable AR that combines a vision-based tracker with an inertial tracker, while Wagner et al. (2010) presented 6DOF natural feature tracking techniques suitable for phone applications. Barry et al. (2012) introduced a system for enhancing the museum visitor experience in a multimedia studio at the Natural History Museum, London, where AR technologies gave visitors a dynamic experience using tablets and mobile devices. Hagbi et al. (2009) presented another mobile phone system that uses AR to track features and in which users can teach the system new shapes in real time. Liu and Granier (2012) developed techniques for AR that track outdoor illumination variations online in videos captured with moving cameras. Two gaming systems were developed in (Ebling and Caceres, 2010) to encourage people to get involved in their cities.
This paper describes the development of an interactive book using augmented reality tools, which can be easily adopted by early-grade teaching systems. The rest of the paper is organized as follows: Section 2 provides an overview of augmented reality in education and sheds some light on existing related technologies. Section 3 describes the proposed framework and its main features. Section 4 explains the procedure for user interaction. Section 5 describes the implementation details. Finally, the conclusions are summarized in Section 6.

Augmented Reality in Education
Augmented Reality is a variation of Virtual Reality (VR) in which participants interact with digital information embedded within the physical, real environment (Dunleavy and Dede, 2014). Whereas VR technology simulates the real world, AR places virtual objects in the real world. AR therefore completes reality rather than replacing it, and virtual and real objects appear to exist together in the real world from the user's perspective. AR can be thought of as a middle layer between VR, which is completely artificial, and the physical world, which is completely real (Azuma, 1997).
Many systematic literature reviews in the area of AR in education have concluded that AR offers a clear advantage in promoting enhanced learning achievement, despite some noted difficulties in its usage and some technical issues related to the technology. The benefits highlighted in this research include: improved learning at all levels, in all places and for people of all backgrounds; no requirement for special equipment; higher student engagement and interest; improved collaboration capabilities; a faster and more effective learning process; practical learning; and safe and efficient workplace training (Yuen et al., 2011; Wu et al., 2013; Radu, 2014; Bacca et al., 2014; Akçayır and Akçayır, 2017; Chen et al., 2017; Masmuzidin and Aziz, 2018; Sommerauer and Müller, 2018).

Existing AR Systems in Education and Learning
Marshall (2005) has shown that readers and students are not yet willing to give up the physicality of real books and hard-copy material because they offer a broad range of advantages, such as portability, flexibility and robustness. As also shown by Marshall (2005), the future of physical books can therefore be changed and enhanced with the individual reader in mind. Books can be digitally enhanced to combine the benefits of physical books with the new interaction possibilities offered by digital media. A suitable combination of pedagogical structure and new technologies should enhance learning outcomes (Shirazi and Behzadan, 2013; Bower et al., 2014). Billinghurst et al. (2001; Billinghurst and Duenser, 2012) developed a magic book that enables users to see virtual content superimposed over real book pages in an AR view and then transition into a virtual reality view to experience a fully immersive view of the data (Grasset et al., 2008). This methodology depends on feature extraction and requires a great deal of customization skill to augment any hard-copy material; the solution only visualizes the objects, without interaction or searching techniques. In addition, the visualization was not built on a well-defined framework. Cooperstock (2001) developed a presenter-tracking algorithm that follows the instructor's movements, even in front of a projected video screen, thereby obviating the need for a professional camera operator. The system then compresses the captured video and uploads it to an e-learning portal. This algorithm tracks the motion of the instructor but not the material itself. The incoming video stream is not rendered in real time and processing may be time consuming, since the algorithm depends only on feature analysis. Al-Wabil et al. (2010) proposed an educational game for children with dyslexia and attention deficit disorders, who often suffer from problems with short-term memory. The system uses a learning strategy to enhance short-term memory, although it does not rely on a physical book.
AR has also been used to teach students at different levels, for example by building three-dimensional dynamic geometric objects rather than depending on traditional methods. Al-Wabil et al. (2010) developed an AR system that enables students and teachers to work on 3-D objects, which enhances students' learning of complex spatial problems. In Liarokapis et al. (2004), students explore 3-D mechanical engineering materials through a 3-D Web application. The system architecture includes an XML repository, a communication server and a visualization client, which are built with the help of an AR environment. Students can interact with machines, vehicles, platonic solids and tools, which enhances mechanical engineering teaching methods.

A Model for Learning Using Augmented Reality
This section presents a system called TAGtech, which constitutes a general model for using Augmented Reality with physical books. Figure 1 shows the active players in such a system: the virtual environment (tag information, media streams and audio), the supporting hardware (cameras and data show projectors), the physical contents (a book and tag images) and the educational environment represented by the teacher. Together, these components form the interactive book, which is used to enhance the student's overall learning.

Overview of System's Operation
TAGtech has been designed to act as a platform for converting traditional books into interactive books that use Augmented Reality concepts. The platform has been designed with people of little or no technical background in mind. A kindergarten teacher, for example, can use the system with ease to produce what we refer to as an AR-Ready Book. This book can be used later by the teacher or even by the students themselves. Figure 2 shows a picture of the "Sandy Starfish" interactive book created in this work, along with the different tags used.
The basic operation of TAGtech can be illustrated in the following simple steps:

Creating an Augmented Reality-Ready Book
This is an important initial step in which the teacher first chooses the book or story and highlights the portions of the book that may require further illustration using 2-D, 3-D, sound or video objects (MM-Objects). The next step is to choose the required multimedia files from existing sources or from the web. Once the images and videos are selected and stored locally, the teacher runs the TAGtech tag creation software. This portion of TAGtech is user friendly and easy to operate. All the teacher has to do is select unused tags, upload the selected multimedia files and connect each tag with its file. The tags are then printed, cut out and inserted in the proper locations in the book. The AR-Ready Book is now ready for use.

Running the system
Once the AR-Ready Book has been created, the operation of the system is straightforward. The teacher must first ensure that a proper setup is in place, consisting of a PC or a laptop, a webcam connected to the PC and, optionally, a data show projector.
The book can be located anywhere and in any orientation, as long as its pages are within the range of the webcam. As the teacher reads the storybook, the corresponding multimedia file is activated whenever the webcam detects a tag anywhere within the book. The book can be moved or rotated as the teacher wishes without compromising the appearance of the images.

System's Architecture
As shown in Fig. 3, TAGtech components are built on top of three layers. The bottom layer is the Microsoft .NET Framework, followed by the AForge.NET and GRATF libraries and finally Microsoft XNA for 3-D graphics. The system consists of three major components: Glyph Recognition, Glyph Creation and Projection.

Glyph Recognition Components
According to the proposed architecture, glyph recognition is composed of the following components:

Glyph Recognizer
The purpose of this component is to find all quadrilateral areas, colors or image features and extract them from the incoming camera video stream.

Glyph Database
The glyph database consists of three tables: a table containing the tags (glyphs) and their associated binary matrices (0's and 1's); a table of multimedia object information, linked to the tag table and to the physical multimedia files in the repository; and a table containing the type of each multimedia object, used later when displaying or activating the object. Figure 4 shows a schema diagram for the glyph database. It is important to note that all the tags stored in the database have a "rotation variant property" that allows the system to recognize tags scanned by the web camera even if they are tilted or rotated. This process is further explained in later sections.
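To make the three-table layout and the rotation-tolerant lookup more concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The table and column names, the string encoding of the matrix and the file path are all assumptions introduced for illustration; the actual TAGtech schema is the one shown in Fig. 4. The lookup simply tries the scanned matrix in each of its four 90-degree orientations.

```python
import sqlite3

def rotate_cw(matrix):
    """Rotate a square 0/1 matrix by 90 degrees clockwise."""
    return [list(row) for row in zip(*matrix[::-1])]

def encode(matrix):
    """Serialize a 0/1 matrix as text, with rows separated by '/'."""
    return "/".join("".join(str(c) for c in row) for row in matrix)

def create_db():
    """Create the three hypothetical tables in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE media_type (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE glyph (id INTEGER PRIMARY KEY, matrix TEXT UNIQUE NOT NULL);
        CREATE TABLE media_object (
            id INTEGER PRIMARY KEY,
            glyph_id INTEGER NOT NULL REFERENCES glyph(id),
            type_id INTEGER NOT NULL REFERENCES media_type(id),
            file_path TEXT NOT NULL);
    """)
    return db

def find_media(db, scanned):
    """Look up a scanned matrix under all four rotations.

    Returns (file_path, type_name, quarter_turns) or None, where
    quarter_turns is the number of clockwise quarter turns applied
    to the scanned matrix before it matched a stored glyph.
    """
    candidate = scanned
    for turns in range(4):
        row = db.execute(
            "SELECT m.file_path, t.name FROM glyph g "
            "JOIN media_object m ON m.glyph_id = g.id "
            "JOIN media_type t ON t.id = m.type_id "
            "WHERE g.matrix = ?", (encode(candidate),)).fetchone()
        if row:
            return row[0], row[1], turns
        candidate = rotate_cw(candidate)
    return None

# Usage: register one glyph (the 5 x 5 array of Fig. 10) and scan it tilted.
db = create_db()
glyph = [[0, 0, 0, 0, 0],
         [0, 1, 1, 0, 0],
         [0, 0, 1, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
db.execute("INSERT INTO media_type VALUES (1, 'video')")
db.execute("INSERT INTO glyph VALUES (1, ?)", (encode(glyph),))
db.execute("INSERT INTO media_object VALUES (1, 1, 1, 'media/starfish.mp4')")
print(find_media(db, rotate_cw(glyph)))
```

The returned number of quarter turns is one possible way to recover the orientation of the tag, which a projection step could use to draw the media upright.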

Glyph Tracker
A glyph tracker is a class used for 3-D Augmented Reality, which may require 3-D pose estimation. It tracks glyphs, reduces glyph shaking and estimates their 3-D pose in the real world.

Glyph Creator Component
This component implements a transitive closure N × N square matrix that represents the tag in a chessboard style. The glyph creator generates new glyphs for the user, allows the user to upload a 2-D or 3-D object, connects the glyph with the uploaded object and finally sends the square matrix (tag) to the glyph recognizer.
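As an illustration of how such a matrix could be generated, the sketch below draws random N × N glyphs and keeps the first one that satisfies the layout constraints described for the tags in this paper: an all-black border and at least one white cell in every interior row and column. The function names and the rejection-sampling strategy are assumptions made for this example, not the TAGtech implementation, and the sketch does not check that a new glyph differs from existing ones.

```python
import random

def is_valid_glyph(m):
    """Check the border and interior constraints of a 0/1 glyph matrix."""
    n = len(m)
    if n < 3 or any(len(row) != n for row in m):
        return False
    # First and last rows and columns must be entirely black (0).
    border_black = all(
        m[0][k] == 0 and m[n - 1][k] == 0 and m[k][0] == 0 and m[k][n - 1] == 0
        for k in range(n))
    # Every interior row and column needs at least one white cell (1).
    rows_ok = all(any(m[i][j] == 1 for j in range(1, n - 1))
                  for i in range(1, n - 1))
    cols_ok = all(any(m[i][j] == 1 for i in range(1, n - 1))
                  for j in range(1, n - 1))
    return border_black and rows_ok and cols_ok

def generate_glyph(n, rng):
    """Draw random interior cells until the glyph constraints hold."""
    while True:
        m = [[0] * n for _ in range(n)]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                m[i][j] = rng.randint(0, 1)
        if is_valid_glyph(m):
            return m

# Usage: print one 5 x 5 glyph, one row per line.
for row in generate_glyph(5, random.Random(0)):
    print(" ".join(str(c) for c in row))
```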

Projection Layout Component
The purpose of the Projection Layout component is to display the 2-D or 3-D images on top of the glyph (tag). This component relies on a well-known algorithm called POSIT, described in (Dementhon and Davis, 1995), and consists of the following subcomponents:

Graphics Device Services
This Service contains classes to use the graphics device to load and render resources and to apply effects to vertices and pixels.

Multimedia Collection
This component contains the prepared models, represented as a list of multimedia objects: 2-D, 3-D, voice, video and other objects.

Service Container
This part is responsible for displaying the 3-D and 2-D objects inside the traditional book.

System's Procedures for the Implementation of Virtual Reality Objects
Virtual reality in TAGtech is implemented using existing procedures available in the GRATF framework. GRATF stands for Glyph Recognition And Tracking Framework. The project provides a library for the localization, recognition and pose estimation of optical glyphs in still images, video streams and video files. All the image processing algorithms implemented in GRATF are based on the AForge.NET framework, which is a C# framework. This framework is designed for researchers and developers in different fields such as computer vision and artificial intelligence, genetic algorithms, image processing, machine learning, neural networks and robotics.
The main process of TAGtech consists of three main procedures: Glyph Creation, Glyph Recognition and Projection Layout.
A web camera scans the glyphs and the system converts them into 2-D and 3-D images, or into other actions such as playing a video or a sound. This process requires several steps involving several functions available in the GRATF library. A computer vision algorithm finds glyphs in a video stream and substitutes them with artificially generated objects, creating a view that is half real and half virtual: virtual objects in a real world. The glyph recognition process consists of several steps, as explained below:

Glyph Recognition Procedures
Glyph recognition consists of a three-phase process: Glyph Extraction, Shape Recognition and Retrieving MM-Objects from the Database. These procedures are described in the sections below and are depicted in Fig. 5.

TAG (Glyph) Extraction Procedures
The purpose of this step is to find all quadrilaterals, colors or image features and extract them from the incoming video stream. Figures 6 and 7 show sample glyphs, each represented by a square grid with the same number of rows and columns. The cells are filled in either black or white, except for the first and last rows and columns of each glyph, which contain only black cells and so form a black border around the glyph. It is also assumed that every other row and column contains at least one white cell, so no row or column is completely black except the first and last. If these glyphs are printed on white paper, a white area therefore surrounds the black glyph border. The detailed procedures involved are as follows.

a. Convert Multi-Layer Image to Single Layer
Since the tags (glyphs) are black and white, the first step is to remove colors by converting each image to a single-layer grayscale image, in order to reduce the amount of data to be handled in the next steps.
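As a minimal sketch of this conversion, the Python fragment below collapses an RGB frame into a single grayscale layer. The BT.709 luma weights and the list-of-tuples image representation are assumptions made for illustration; the paper does not state which weighting TAGtech uses.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples (0..255) to rows of gray levels."""
    # Assumed BT.709 weights; they sum to 1, so white stays at 255.
    return [[int(round(0.2125 * r + 0.7154 * g + 0.0721 * b))
             for (r, g, b) in row]
            for row in rgb_image]

# Usage: a 1 x 3 frame with a white, a black and a pure red pixel.
frame = [[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]
print(to_grayscale(frame))
```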

b. Gray Level Image to a Binary Image
In this step, a search and analysis for black quadrilaterals surrounded by white areas is accomplished by performing Otsu (Liao et al., 1999) thresholding and then blob analysis.
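For readers unfamiliar with the thresholding step, the sketch below shows a plain single-threshold form of Otsu's method, which picks the gray level that maximizes the between-class variance of the histogram. It is written in pure Python on a list-of-lists image for illustration only; the function names are hypothetical and TAGtech relies on the library implementation instead.

```python
def otsu_threshold(gray):
    """Return the gray level t that best separates dark from bright pixels."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance (scaled)
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def binarize(gray, t):
    """Map pixels above the threshold to 1 (white) and the rest to 0 (black)."""
    return [[1 if v > t else 0 for v in row] for row in gray]

# Usage on a tiny image with two clearly separated gray levels.
image = [[10, 10, 200], [10, 200, 200]]
print(binarize(image, otsu_threshold(image)))
```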

c. TAG Quadrilaterals Detection Using Difference Edge Detector
To avoid erroneous results due to problems with illumination and lighting, where black edges may appear lighter or white edges may appear darker, a difference edge detector is used, in which each glyph outline yields a standalone blob forming a quadrilateral. Provided the lighting conditions are not too bad, each glyph's quadrilateral has a well-connected edge represented as a single blob, so these blobs can easily be extracted using a blob counting algorithm.
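One common form of difference edge detector replaces each pixel with the largest absolute difference between opposite neighbors in the four directions passing through it. The sketch below assumes this form and a list-of-lists grayscale image; the function name is hypothetical and the code is meant to illustrate the idea rather than reproduce the library routine used by TAGtech.

```python
def difference_edges(gray):
    """Return an edge-strength image of the same size as `gray`."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]  # border pixels are left at 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = max(
                abs(gray[y - 1][x - 1] - gray[y + 1][x + 1]),  # diagonal
                abs(gray[y - 1][x] - gray[y + 1][x]),          # vertical
                abs(gray[y - 1][x + 1] - gray[y + 1][x - 1]),  # anti-diagonal
                abs(gray[y][x - 1] - gray[y][x + 1]))          # horizontal
    return out

# Usage: a vertical black-to-white step produces a two-pixel-wide edge.
step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
for row in difference_edges(step):
    print(row)
```

Because the operator responds to local differences rather than absolute brightness, it is less sensitive to an overall lighter or darker frame, which matches the motivation given above.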

d. Find and Handle all Standalone Blobs
In this step, all resulting blobs are examined to make sure they represent the edges of a tag. Only blobs that form a quadrilateral-looking object are selected, since a tag is always represented by a quadrilateral regardless of how it is rotated. For a quadrilateral to be considered a tag, it must be dark on the inside and white on the outside.
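The blob handling in this step can be sketched as follows. The code groups 8-connected edge pixels into blobs and then applies a simplified version of the dark-inside, white-outside test. As a loud simplification, it compares brightness along the axis-aligned bounding box of a blob rather than along a fitted quadrilateral, so it only illustrates the idea for upright tags; the function names, the minimum blob size and the contrast threshold are assumptions.

```python
from collections import deque

def find_blobs(mask, min_size=4):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected groups of 1s."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] != 1 or seen[sy][sx]:
                continue
            queue = deque([(sx, sy)])
            seen[sy][sx] = True
            x0 = x1 = sx
            y0 = y1 = sy
            count = 0
            while queue:  # breadth-first flood fill of one blob
                x, y = queue.popleft()
                count += 1
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            if count >= min_size:  # drop isolated noise pixels
                boxes.append((x0, y0, x1, y1))
    return boxes

def dark_inside_bright_outside(gray, box, min_contrast=50):
    """Compare mean brightness just inside and just outside a bounding box."""
    h, w = len(gray), len(gray[0])

    def ring_mean(ax0, ay0, ax1, ay1):
        vals = [gray[y][x]
                for y in range(ay0, ay1 + 1) for x in range(ax0, ax1 + 1)
                if (y in (ay0, ay1) or x in (ax0, ax1))
                and 0 <= x < w and 0 <= y < h]
        return sum(vals) / len(vals) if vals else None

    x0, y0, x1, y1 = box
    inside = ring_mean(x0 + 1, y0 + 1, x1 - 1, y1 - 1)
    outside = ring_mean(x0 - 1, y0 - 1, x1 + 1, y1 + 1)
    if inside is None or outside is None:
        return False
    return outside - inside >= min_contrast
```

A candidate that passes both functions would then be handed to the shape recognition stage; a full implementation would additionally fit four corners to the blob outline.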

Shape Recognition Procedures
Each potential tag is represented by a square image containing the tag data. TAGtech implements a quadrilateral transformation algorithm to transform any quadrilateral from a given source image into a rectangular image. This algorithm is based on homogeneous transformation; its mathematics is described by Heckbert (1999).
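A quadrilateral transformation of this kind can be sketched as a projective mapping x = (a·u + b·v + c) / (g·u + h·v + 1), y = (d·u + e·v + f) / (g·u + h·v + 1), whose eight coefficients follow from the four corner correspondences. The pure-Python code below is an illustrative sketch under that assumption, using nearest-neighbor sampling; the function names and the corner ordering are hypothetical and it is not the library routine used by TAGtech.

```python
def solve_linear(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography(src_pts, dst_pts):
    """Coefficients (a..h) mapping each (u, v) in src_pts to (x, y) in dst_pts."""
    rows, rhs = [], []
    for (u, v), (x, y) in zip(src_pts, dst_pts):
        rows.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        rhs.append(x)
        rows.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        rhs.append(y)
    return solve_linear(rows, rhs)

def apply_h(hc, u, v):
    """Map one point through the projective transformation."""
    a, b, c, d, e, f, g, h = hc
    w = g * u + h * v + 1.0
    return ((a * u + b * v + c) / w, (d * u + e * v + f) / w)

def rectify(gray, quad, size):
    """Sample a size x size image from `quad`, whose corners are given as
    top-left, top-right, bottom-right, bottom-left (x, y) positions."""
    s = size - 1
    hc = homography([(0, 0), (s, 0), (s, s), (0, s)], quad)
    out = []
    for v in range(size):
        row = []
        for u in range(size):
            x, y = apply_h(hc, u, v)
            xi = min(max(int(round(x)), 0), len(gray[0]) - 1)
            yi = min(max(int(round(y)), 0), len(gray) - 1)
            row.append(gray[yi][xi])  # nearest-neighbor sampling
        out.append(row)
    return out
```

Mapping from the output square back into the source image, as done here, guarantees that every output pixel receives a value.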
The tag is originally represented by an N × N square grid of black or white cells. The easiest method to recognize the tag is to divide the extracted image into an N × N array equal in size to the original tag. A binary N × N array is also created and filled with either "0" for black or "1" for white. However, this process is not so straightforward, due to imperfections in the color of the cells. This problem is solved by concentrating on the center of each cell while discarding pixels located at the boundaries, as shown in Fig. 8. White pixels are then counted in every cell (the area highlighted with light gray lines) and, if their number exceeds 60% of the total pixels in the cell, the cell is concluded to be white and a value of "1" is inserted in the corresponding cell of the binary array.
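The cell-reading rule above can be sketched in a few lines of Python. The 60% threshold comes from the text; the 20% margin discarded on each side of a cell, the function name and the list-of-lists binary image are assumptions made for this example.

```python
def read_glyph(binary, n, margin=0.2, white_ratio=0.6):
    """Turn a square binary image (0 = black, 1 = white) into an n x n array."""
    size = len(binary)
    cell = size / n
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            # Keep only the central part of the cell; skip its boundary pixels.
            y0 = int((i + margin) * cell)
            y1 = max(int((i + 1 - margin) * cell), y0 + 1)
            x0 = int((j + margin) * cell)
            x1 = max(int((j + 1 - margin) * cell), x0 + 1)
            pixels = [binary[y][x]
                      for y in range(y0, y1) for x in range(x0, x1)]
            # The cell is white if more than 60% of the kept pixels are white.
            row.append(1 if sum(pixels) > white_ratio * len(pixels) else 0)
        grid.append(row)
    return grid

# Usage: a 50 x 50 rendering of a 5 x 5 glyph with a smudged cell border.
glyph = [[0, 0, 0, 0, 0],
         [0, 1, 1, 0, 0],
         [0, 0, 1, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
image = [[glyph[y // 10][x // 10] for x in range(50)] for y in range(50)]
for k in range(10, 20):
    image[10][k] = 0  # darken the top boundary of cell (1, 1)
print(read_glyph(image, 5) == glyph)
```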

Conclusion
This paper presented a learning system based on augmented reality. The main advantage of this system is its low cost, as it relies on existing devices such as a web camera, a laptop and a data show projector. The system was implemented using existing frameworks, including AForge.NET, GRATF and Microsoft XNA for 3-D graphics rendering. The project resulted in an e-book based on a well-known story, "Sandy Starfish".
As already mentioned, recognizing a glyph in an image or video frame is not as complex as localizing it; the hardest part is finding a quadrilateral that may contain a glyph. We found that the edge detection approach performed extremely well and much better than other techniques, given appropriate lighting conditions. Future work aims to enhance the process of searching and retrieving multimedia objects from internet repositories using an ontology-based approach and to improve the storage and retrieval mechanisms. Object classification using data mining techniques, namely convolutional neural networks, will be applied to recognize required objects.

0 0 0 0 0
0 1 1 0 0
0 0 1 1 0
0 0 1 0 0
0 0 0 0 0

Fig. 10: A two-dimensional array corresponding to black and white cells of a glyph's image