DISTRIBUTED IMAGE PROCESSING APPLICATION CONSIDERING CORBA AND XML TECHNOLOGY
Abstract: This paper presents a distributed image processing application intended as an efficient and powerful instrument for image analysis in varied domains such as medical and industrial applications. The implementation uses the CORBA technology, with C++ and Python as programming languages. The information (texts, images, sounds) is stored in XML files.
Key words: CORBA, XML, multimedia, distributed application
This paper presents a distributed image processing application whose objective is to provide an efficient and powerful instrument for image analysis in varied domains such as medical and industrial applications.
Most image processing tasks require large amounts of memory or processing power, which creates a need for alternative ways of performing them. This becomes even more important when we do not want to invest in expensive hardware or processing devices. Distributed systems are a powerful mechanism for processing heavy tasks. Various technologies are available to enable distributed processing: Java RMI (Remote Method Invocation), CORBA (Common Object Request Broker Architecture), DCOM (Distributed Component Object Model), RPC (Remote Procedure Call), SOAP (Simple Object Access Protocol), PVM (Parallel Virtual Machine), MPI (Message Passing Interface) and, more recently, grid computing.
This project investigates the design and implementation of a distributed image processing system with a client-server architecture, using exclusively open standards and free tools. The implementation uses the CORBA technology, with C++ and Python as programming languages. The data (texts, images, sounds) is stored in XML files.
This application is Open Source software and is registered at SourceForge.net, which provides free hosting for Open Source software development projects. The concept of Open Source promotes the benefits of collaborative development by ensuring that potential end-users are able to obtain and use the software, and that the software may be improved and expanded to meet the needs of its users. Collaboration within the Open Source community (developers and end-users) promotes a higher standard of quality and helps to ensure the long-term viability of both data and applications.
II. THE CORBA TECHNOLOGY
Common Object Request Broker Architecture (CORBA) is a standard for distributed computing and systems integration. It allows lightweight objects to be accessed from anywhere in a network without concern for the operating system they are running on or the programming language they are written in. This facility is useful for designers and implementers of systems when a mix of technologies must be used, or for client/server systems that use multiple machines. CORBA can be viewed as a systems integration facility or as a layer that makes distributed software systems easier to write.
The members of the Object Management Group (OMG) define the CORBA standard. At its core, it defines the facilities required to allow a client machine to invoke the facilities offered by a software component running on any machine in a heterogeneous distributed system. Remote components are available across many boundaries, including
- different operating systems, such as different versions of UNIX, Windows, MVS, VMS, MacOS, OS/2, and real-time systems such as VxWorks
- different programming languages, such as C++, Smalltalk, Ada, C, Python and Java, with many others to follow
The components of a CORBA system are objects. Each object has an interface that defines the services it offers to its clients, and this interface is written in an interface definition language (IDL) specified by the CORBA standard. IDL is not a programming language and it does not replace languages such as C++, Python, Java, C, and Smalltalk. Instead, IDL's only role is to define interfaces. Each interface defines the operations and attributes that are available to clients. The advantage of using IDL is that it allows a software component to define its interface independently of the programming language used to code the component itself, or of the language used to code the clients of the component. In particular, the language used to implement objects need not be the same as that used by clients, and the clients that use a given object need not all be coded in the same language. A client program written in Python need not be aware that a CORBA object it is using is written in C++. In this case, the IDL definition of the interface of the object is translated automatically into Python for the benefit of the client, and into C++ for the benefit of the implementer of the interface. [4,5]
III. THE XML FACILITIES
The Extensible Markup Language (XML) is a simple, very flexible text format derived from the Standard Generalized Markup Language (SGML). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.
XML has a few key advantages that make it the data language of choice on the Internet. These advantages were designed into XML from the beginning, and, in fact, are what made it so appealing to Internet developers:
- Application Neutrality: XML is both human- and machine-readable. An XML document is usually handled by an XML parser or processor, but if one is not available, the document can still be read and interpreted easily. Data kept in XML is not trapped within the constraints of one particular software application.
- Hierarchical Structure: XML is hierarchical, and allows programmers to choose their own tag names. This is quite different from HTML. In XML, the programmer is free to create elements of any type, and to nest other elements within those elements.
- Platform Neutrality: XML is cross-platform. This is mainly a consequence of its text-based format. The use of well-defined text encodings ensures that platforms do not disagree about the layout of an XML document. XML is designed for use in conjunction with existing Internet infrastructure using HTTP, SSL, and other messaging protocols as they evolve. These qualities make XML well suited to distributed applications; it has been successfully used as a foundation for message queuing systems, instant messaging applications, and remote procedure call frameworks.
- International Language Support: all XML documents are Unicode, and they describe their own encoding in such a way that all XML processors are able to determine what encoding the document was written in. A few specific encodings (UTF-8 and UTF-16) must be recognized by all processors, so that it is always possible to generate XML that can be read anywhere and can represent all of the world's characters.
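The properties listed above can be illustrated with a short Python sketch. It uses the standard-library module xml.etree.ElementTree rather than any library named in this paper, and the element names (record, patient, note) are invented purely for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names are chosen freely by the author
# and nested to form a hierarchy.
root = ET.Element("record")
patient = ET.SubElement(root, "patient", id="42")
note = ET.SubElement(patient, "note")
note.text = "Röntgen image, 512×512"  # non-ASCII characters are allowed

# Serialise with an explicit encoding; the XML declaration records it,
# so a conforming parser on any platform can decode the bytes.
data = ET.tostring(root, encoding="utf-8", xml_declaration=True)

# Parse the bytes back and walk the hierarchy by path.
parsed = ET.fromstring(data)
recovered = parsed.find("patient/note").text
```

The serialised form is plain text, so it can be inspected in any editor, which is the application and platform neutrality described above.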
IV. CLIENT-SERVER APPLICATION USING CORBA AND XML
We have used the CORBA technology to obtain a flexible processing system that allows the integration of new components without concern for the operating system they run on or the programming language they are written in, and to improve the performance of the system by spreading its components across different computers.
The image processing application is divided into two parts and has a client-server architecture. The first part is the data processing part and is implemented in the server. Because the processed data are images, processing tasks such as edge detection, image segmentation and pattern recognition need a great deal of computing power. To achieve good performance, the server is implemented in the C++ programming language and should run on a machine with powerful hardware resources, separate from the client's machine, which needs fewer resources. The second part is the presentation part and is implemented in the client using the Python programming language. Because all of the technologies used are open standards and the tools are cross-platform, the client and the server applications are platform-independent.
Figure 1 presents the application's architecture in a heterogeneous environment.
The client program runs on a computer with no special hardware requirements, running the Windows operating system. It is implemented in the very-high-level scripting language Python, which speeds up application development. The interaction with the user is realised using the wxPython graphical user interface.
The server program, which needs much computing power, runs on a well-equipped computer running the Linux operating system. It is implemented in the C++ programming language, which speeds up the processing. The image processing is realised using the Magick++ library, which is the object-oriented C++ API to the ImageMagick image-processing library, a comprehensive open-source image-processing package.
The communication between the client and the server is realised using the CORBA technology. One of the free products that implement the CORBA specifications is omniORB. omniORBpy is an Object Request Broker (ORB) that implements the CORBA Python mapping. It works in conjunction with omniORB for C++. 
The processed data (texts, images, sounds) is encapsulated in XML files. XML is an excellent format for exchanging data between applications. To store the images and sounds, which are binary data, in an XML file, which is a text file, the base64 or hex encoding is used. To parse this type of file, the client uses the PyXML library and the server uses the Xerces-C++ library.
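The encapsulation of binary data in XML can be sketched as follows. This is a minimal illustration, not the application's actual code: it uses Python's standard xml.etree.ElementTree module instead of PyXML, and the document layout (an image element with width, height and encoding attributes) is an assumption made for the example.

```python
import base64
import xml.etree.ElementTree as ET


def image_to_xml(pixels, width, height, encoding="base64"):
    """Wrap raw image bytes in a small XML document (hypothetical layout)."""
    if encoding == "base64":
        text = base64.b64encode(pixels).decode("ascii")
    elif encoding == "hex":
        text = pixels.hex()
    else:
        raise ValueError(f"unsupported encoding: {encoding}")
    root = ET.Element("image", width=str(width), height=str(height),
                      encoding=encoding)
    root.text = text
    return ET.tostring(root, encoding="utf-8")


def xml_to_image(document):
    """Return (pixels, width, height) decoded from the XML document."""
    root = ET.fromstring(document)
    text = root.text or ""
    if root.get("encoding") == "base64":
        pixels = base64.b64decode(text)
    else:
        pixels = bytes.fromhex(text)
    return pixels, int(root.get("width")), int(root.get("height"))
```

Recording the encoding as an attribute lets the receiver decode the payload without any prior agreement on which of the two encodings was used.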
The data transfer between the client and the server is realised using the File Transfer Protocol (FTP). An FTP server is installed on the server machine, and the client program communicates with it using Python's ftplib module.
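A client-side transfer with ftplib might look like the sketch below. The helper names, host, credentials and file names are hypothetical; only the ftplib calls (FTP, login, storbinary, retrbinary) are taken from the standard library.

```python
import io
from ftplib import FTP


def upload_document(ftp, remote_name, document):
    """Store the XML document (bytes) on the server under remote_name."""
    ftp.storbinary(f"STOR {remote_name}", io.BytesIO(document))


def download_document(ftp, remote_name):
    """Retrieve the file remote_name from the server and return its bytes."""
    buffer = io.BytesIO()
    ftp.retrbinary(f"RETR {remote_name}", buffer.write)
    return buffer.getvalue()


# Usage (hypothetical host, credentials and file names):
# with FTP("server.example.org") as ftp:
#     ftp.login(user="client", passwd="secret")
#     upload_document(ftp, "request.xml", request_bytes)
#     result_bytes = download_document(ftp, "result.xml")
```

Passing an already connected FTP object to the helpers keeps the connection handling in one place and allows several documents to be exchanged over a single session.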
For our experiments we have used two computers whose network adapters (10/100 Mbit/s) were connected with a cross-over network cable. The client program runs on a computer with a 166 MHz Pentium processor, 32 MB of memory and the Windows 98 operating system. The server program runs on a computer with a 1.4 GHz Pentium processor, 256 MB of memory and the Debian Linux 3.0 operating system.
We have measured the speed of the image processing considering the following two test scenarios:
a) the image processing is performed locally on the client machine and is embedded in the client program. The processing algorithms are implemented in the scripting language Python.
b) the image processing is performed remotely on the server machine. The processing algorithms are implemented in the C++ programming language.
In both cases, the processing algorithms are exactly the same; only the implementation languages are different. The programs implemented in C++ run very fast but take more time to develop. The programs implemented in Python are slower than their C++ equivalents, but the development process is much faster.
Our results show the time elapsed from the moment when the client starts the processing of the in-memory image until the moment when the processed image is available in the client's memory. In order to reflect the performance of the system, we have varied the resolution of the images to be processed.
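A measurement of this kind can be sketched in Python as follows. The processing step (a simple grey-value inversion) and the list of resolutions are placeholders chosen for the illustration; they are not the algorithms or image sizes used in our experiments.

```python
import time


def invert(pixels):
    """Stand-in processing step: invert every 8-bit grey value."""
    return bytearray(255 - p for p in pixels)


def time_processing(process, width, height):
    """Return (elapsed_seconds, result) for one in-memory image."""
    image = bytearray(width * height)  # blank 8-bit greyscale image
    start = time.perf_counter()
    result = process(image)
    elapsed = time.perf_counter() - start
    return elapsed, result


# Repeat the measurement for several (hypothetical) resolutions.
timings = {}
for width, height in [(64, 64), (128, 128), (256, 256)]:
    elapsed, _ = time_processing(invert, width, height)
    timings[(width, height)] = elapsed
```

For the remote case, the process argument would wrap the whole round trip (serialisation, transfer, remote processing, transfer back, parsing), so that both scenarios are timed between the same two points.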
Figure 2 presents the actions that are executed in our two test cases and Figure 3 presents the results of the measurements.
The images are stored in the client’s memory as bitmaps. In the first case, the original image, the image processing and the processed image are all located in the client’s memory. In the second case, the original image (in memory) is serialised to an XML document (on disk) using the hex or base64 encoding. The XML document is then transferred to the server machine using FTP. The server parses the XML document, loads the image into memory, processes it and serialises the processed image to an XML document. The client receives this document using FTP and loads the processed image from the XML document into memory.
In our experiments we have considered an optimal network connection in which the client and server network cards (10/100 Mbit/s) were directly connected using a crossover cable. When the client is connected to the server through a slower network, the transfer of the multimedia files becomes a critical point in our system: the time needed for data transfer grows, so the performance of the system decreases. To optimise the data transfer time, we can reduce the size of the transferred data by compressing the files.
The principal factor that increases the size of the transferred files is the conversion of the binary data (image, sound) to text (ASCII) format in order to be serialised in XML files. We have used two encoding algorithms, base64 and hex. In the worst case (hex encoding), the resulting ASCII-encoded representation is twice as large as the binary data, but this negative effect of the encoding can be reduced using a suitable compression algorithm. For example, the Lempel-Ziv algorithm leads to a roughly 50% compression of ASCII-encoded data, so the negative effect of the encoding is almost cancelled out.
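The size overhead of the two encodings, and the effect of compressing the encoded text, can be checked with a short Python sketch. It is an illustration under stated assumptions: the payload is pseudo-random bytes standing in for image data, and zlib (DEFLATE, which combines Lempel-Ziv matching with Huffman coding) stands in for the compression algorithm.

```python
import base64
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
# 30000 pseudo-random bytes as a stand-in for binary image data.
raw = bytes(rng.getrandbits(8) for _ in range(30000))

hex_text = raw.hex().encode("ascii")  # 2 ASCII characters per byte
b64_text = base64.b64encode(raw)      # 4 ASCII characters per 3 bytes

sizes = {
    "raw": len(raw),
    "hex": len(hex_text),
    "base64": len(b64_text),
    "hex+deflate": len(zlib.compress(hex_text)),
    "base64+deflate": len(zlib.compress(b64_text)),
}

for name, size in sizes.items():
    print(f"{name:15s} {size:6d} bytes  ({size / len(raw):.2f} x raw)")
```

Pseudo-random bytes are close to a worst case, since the compressor can only remove the redundancy introduced by the encoding itself; real images usually contain additional redundancy and compress further.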
Another possibility to optimise the system arises when a succession of image transformations must be applied to one image. By changing the application design, the transfer of unnecessary data can be avoided. For a succession of N processes, instead of a scenario in which the intermediary images are transferred from client to server and from server to client for every process, the application can be designed so that the image is transferred only once from client to server, at the beginning of the succession of processes, and is received back from the server after all processes are done.
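The difference between the two designs can be made concrete with a small Python sketch. The function names are hypothetical and the processes are represented as plain callables; the sketch only counts image transfers and shows the server-side chaining.

```python
def transfers_per_step(n_processes):
    """Image transfers when every intermediate result returns to the client."""
    return 2 * n_processes  # one upload and one download per process


def transfers_batched(n_processes):
    """Image transfers when the server runs the whole succession itself."""
    return 2 if n_processes > 0 else 0  # one upload, one download


def run_batched(image, processes):
    """Apply all processes in order on the server side, without round trips."""
    for process in processes:
        image = process(image)
    return image
```

For N = 5 the first design needs 10 image transfers and the second needs 2, so the saving grows linearly with the length of the succession.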
Using the CORBA and XML technologies we have created an efficient distributed image processing system that can be used in varied domains such as medical and industrial applications.
The CORBA technology allows the integration of new components without concern for the operating system they run on or the programming language they are written in. The XML technology is a good choice for storing and transferring information in a flexible, application-neutral and platform-neutral way.
We have combined the very-high-level programming language Python, for easy and fast development of the system’s presentation layer, with the powerful C++ programming language, for the implementation of the image processing algorithms, in order to achieve a faster processing layer.
To increase performance, the system’s components were spread across different computers with different hardware resources. The image processing part, which consumes much computing power, runs on a well-equipped machine, and the presentation part runs on a computer with no special hardware requirements.
Fintan Bolton, “Pure CORBA”, Sams Publishing, 2002
Jon Siegel, “CORBA Fundamentals and Programming”, John Wiley & Sons, 1996
Michi Henning, Steve Vinoski, “Advanced CORBA Programming with C++”, Addison-Wesley, 1999
Christopher A. Jones, Fred L. Drake, Jr., “Python & XML”, O’Reilly & Associates, 2001
R. Anderson, M. Birbeck, M. Kay, S. Livingstone, B. Loesgen, D. Martin, S. Mohr, N. Ozu, B. Peat, J. Pinnock, P. Stark, K. Williams, “Professional XML”, Wrox Press, 2000
Figure 3. The processing time for different image resolutions for the local and remote processing case