Subramanian, Hariharan and Shankar, Priti (2006) Compressing XML Documents Using Recursive Finite State Automata. In: Lecture Notes in Computer Science, 3845. pp. 282-293.
We propose a scheme for automatically generating compressors for XML documents from Document Type Definition (DTD) specifications. Our algorithm is lossless and adaptive: the model used for compression and decompression is generated automatically from the DTD and is used in conjunction with an arithmetic compressor to produce a compressed version of the document. The structure of the model mirrors the syntactic specification of the document. Our compression scheme is on-line, that is, it can compress the document as it is being read. We have implemented the compressor generator, and we report the results of experiments on some large XML databases whose DTDs are specified. We note that the average compression is better than that of XMLPPM, the only other on-line tool we are aware of. The tool is also able to compress massive documents on which XMLPPM failed because it ran out of memory. We believe the main appeal of this technique is that the underlying model is so simple and yet so effective.
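To make the idea of a DTD-derived adaptive model concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical content model `(title, author+, chapter*)` for a single element, hand-translated into a small automaton; the state names, the `AdaptiveModel` class, and the initial count of 1 per transition are all illustrative assumptions. Instead of running a real arithmetic coder, it reports the ideal code length in bits that such a coder would approach.

```python
import math

# Hypothetical automaton for the content model (title, author+, chapter*).
# Each state maps an allowed child element (or "END") to the next state.
TRANSITIONS = {
    "start": {"title": "after_title"},
    "after_title": {"author": "authors"},
    "authors": {"author": "authors", "chapter": "chapters", "END": None},
    "chapters": {"chapter": "chapters", "END": None},
}


class AdaptiveModel:
    """Per-state adaptive frequency counts over the transitions the DTD allows."""

    def __init__(self, transitions):
        self.transitions = transitions
        # Assumption: every allowed transition starts with a count of 1.
        self.counts = {s: {sym: 1 for sym in out} for s, out in transitions.items()}

    def cost_bits(self, children):
        """Return the ideal code length in bits for one child sequence, updating counts."""
        state, bits = "start", 0.0
        for sym in list(children) + ["END"]:
            out = self.counts[state]
            if sym not in out:
                raise ValueError(f"{sym!r} not allowed in state {state!r}")
            bits += -math.log2(out[sym] / sum(out.values()))
            out[sym] += 1  # adapt after coding, so a decoder can mirror the update
            state = self.transitions[state][sym]
        return bits


model = AdaptiveModel(TRANSITIONS)
first = model.cost_bits(["title", "author", "chapter", "chapter"])
second = model.cost_bits(["title", "author", "chapter", "chapter"])
print(f"first: {first:.3f} bits, second: {second:.3f} bits")
```

In this sketch, states with a single outgoing transition cost zero bits because the DTD leaves no choice, and a repeated sequence becomes cheaper as the counts adapt. A full treatment along the lines of the title would need one such automaton per element, invoked recursively for nested elements, plus a model for text content; none of that is shown here.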
Item Type: Journal Article
Additional Information: Copyright of this article belongs to Springer.
Department/Centre: Division of Electrical Sciences > Computer Science & Automation (Formerly, School of Automation)
Date Deposited: 16 May 2006
Last Modified: 19 Sep 2010 04:26