IDENTIFYING CLONED NAVIGATIONAL PATTERNS IN WEB APPLICATIONS
Keywords:
Clone Analysis, Cloned Patterns, Navigational Patterns, Web Application Reengineering
Abstract
Web Applications are subject to continuous and rapid evolution. Programmers often duplicate Web pages indiscriminately, without following systematic development and maintenance methods. This practice creates code clones that make Web Applications hard to maintain and reuse. We present an approach to identify duplicated functionalities in Web Applications through the analysis of cloned navigational patterns. Cloned patterns can be generalized in a reengineering process, thereby simplifying the structure and future maintenance of the Web Applications. The proposed method first identifies pairs of cloned pages by analyzing their similarity in structure, content, and scripting code. Two pages are considered clones if their similarity is greater than a given threshold. Cloned pages are then grouped into clusters, and the links connecting pages of two clusters are grouped as well. An interconnection metric is defined on the links between two clusters to express the effort required to reengineer them and to select the patterns of interest. To further reduce the comprehension effort, we filter out the links and nodes of the clustered navigational schema that do not contribute to the identification of cloned navigational patterns. A tool supporting the proposed approach has been developed and validated in a case study.
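The pipeline outlined in the abstract (pairwise similarity, thresholding, clustering, grouping of inter-cluster links) can be illustrated with a minimal Python sketch. It is not the authors' tool. The similarity measure (`difflib.SequenceMatcher` over tag-name sequences), the clustering by connected components, the threshold value, and the use of a plain link count in place of the interconnection metric are all assumptions made for illustration, as are the page names and data.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Ratio in [0, 1] between two token sequences (assumed: HTML tag names)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_clones(pages, threshold):
    """Group pages whose pairwise similarity is greater than the threshold.

    `pages` maps a page name to its token sequence. Clusters are the connected
    components of the "is a clone of" relation, found with union-find. This
    clustering strategy is an assumption, not taken from the paper.
    """
    parent = {name: name for name in pages}

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(pages, 2):
        if similarity(pages[p], pages[q]) > threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for name in pages:
        clusters.setdefault(find(name), set()).add(name)
    return [frozenset(c) for c in clusters.values()]


def group_links(links, clusters):
    """Count page-level links between each ordered pair of distinct clusters.

    The count is a hypothetical stand-in for the interconnection metric,
    whose actual definition is not given in the abstract.
    """
    owner = {page: c for c in clusters for page in c}
    counts = {}
    for src, dst in links:
        a, b = owner[src], owner[dst]
        if a != b:
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


# Hypothetical data: two list pages and two item pages, each pair cloned.
pages = {
    "list_a": ["html", "body", "table", "tr", "td", "a"],
    "list_b": ["html", "body", "table", "tr", "td", "a"],
    "item_a": ["html", "body", "h1", "p", "img"],
    "item_b": ["html", "body", "h1", "p", "img"],
}
links = [("list_a", "item_a"), ("list_b", "item_b")]

clusters = cluster_clones(pages, threshold=0.8)
grouped = group_links(links, clusters)
```

In this toy input the two list pages and the two item pages each form a cluster, and the two page-level links collapse into a single grouped link between those clusters, which is the kind of repeated list-to-item navigation the approach aims to surface as a candidate for generalization.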