CS 351: Design of Large Programs
--------------------------------
Assignment # 4, Due Date 04/19/02
----------------------------------

The Extensible Markup Language (XML) is a facility that enables
communities of users (for example business, pharmaceutical, etc.) to
define their own tags for inclusion in documents to be exchanged among
them. As in HTML, tags delineate the start and end of elements in a
document. Unlike HTML, however, end tags are mandatory and
element/attribute names are case-sensitive. The correct sequencing,
nesting and number of successive occurrences of elements (zero, one or
more), as well as their attributes, are specified in a Document Type
Definition (DTD). (There is a far more expressive way of specifying the
contents of a document using W3C's XML Schema Standard, but for this
assignment we use the DTD.) An example DTD for purchase orders and a
corresponding instance document are included at the end of this
assignment.

The Big Picture
---------------

A head office receives an XML document of purchase orders from a sales
office every few days. The document should be validated and then saved.
Received orders are maintained (without change of format, for
simplicity) in files named by month, i.e., all orders placed in the
month of April 2002 are stored in file po02-04 as an XML document.

In addition, XML documents for the following purposes should be created.

(i) The marketing department wishes to receive an XML file containing
the names of all customers who have shopped within a given period. The
total volume of purchases made by each customer during that period
should also be included. The entries should appear in decreasing order
of total purchase volume. (The marketing department needs this
information so they can target customers for promoting a new product.)
The DTD for this document appears below.

]>

(ii) This part is for extra credit only.
The suppliers to the store wish to receive the number of items of each
product purchased (without regard to who has purchased what). This
information should be communicated every few days. (It will be forwarded
to the manufacturers so they can plan how much of what to manufacture in
the next production cycle.) This too should be contained in an XML
document, whose type is defined in the DTD to be given in class.

This semester, we will experiment with the Java Architecture for XML
Binding (JAXB), which provides an API and a tool that allow automatic
two-way mapping between XML documents and Java objects. Please read the
primer, sample code and Mike Dolan's handout before you proceed. The
main ideas/steps are summarized below:

(1) We create a minimal binding schema by saving the following in file
orders.xjs:

(2) Invoke the schema compiler with the command xjc and two command line
parameters: the DTD and the binding schema. This creates a Java class
for each "complex" element in the DTD. A complex element is one that
contains other elements or that has an attribute. Each class has
accessor and mutator methods for handling its data members (which
correspond to the sub-elements and attributes of the element).
Sub-elements that may occur more than once are mapped to a List of
objects of the type corresponding to that sub-element. Each class also
contains methods for unmarshalling an XML document (populating lists in
instances of the generated classes using data in the XML document),
marshalling (the reverse of unmarshalling) and validating an object
tree.

For orders.dtd, the schema compiler should generate a total of nine
classes:

    Orders.java
    Order.java
    Customer.java
    Items.java
    Item.java
    Payment.java
    Address.java
    Cash.java
    Credit.java

Note the relation between element and class names. Inspect the source
files of these classes and note the naming convention for the accessor
and mutator methods, since you will need to use them.

(3) Now we need to write the Java application program.

Part 1
------

Your program should read a total of 10 XML files containing purchase
orders. Each file will typically contain orders placed within a week
(but the week itself may straddle two successive months). Orders read
should be saved in the appropriate file, i.e., if an input file has
weekly orders some of which were placed in late April 2002 and some in
early May 2002, then the former should be saved in file po02-04 and the
latter in file po02-05. For simplicity, assume that the orders within a
file are in temporal order (increasing order of date).

Part 2
------

Upon completion of Part 1, your program should display:

    All 10 XML files read, unmarshalled and saved.
    Enter period of interest to the Marketing Dept.:

At this point the user is expected to enter two dates in the format
yy-mm-dd:

    02-01-16 02-02-15

This means that the total volume of purchases made by each customer
during that period (between Jan 16, 2002 and Feb 15, 2002 inclusive)
should be computed and saved in file "mark". The root element for this
document is customers and its DTD appears above.

Part 3 (Extra Credit Part) Details later.
------

******************************************************************

The POs that comprise the document your program will receive are defined
by the DTD (Document Type Definition) below.

]>

******************************************************************

A sample XML document of POs is shown below.

10/22/2001 John
1010 Main Albuquerque 87131 1234567 2347654
shoes 5.00 1 5.00 sugar 15 2.00 2 4.00 9.00 10.00 1.00

10/24/2001 Mary
2643 Bronze New York 88131 2345678 5678234 4765411
ink 1.00 5 5.00 5.00 VISA Mary 1234567890 07/03

10/24/2001 Steve
7432 Silver Greenbelt 89131 2345678 4320659
tooth paste 2.50 2 5.00 tooth brush 1.00 2 2.00 7.00 10.00 3.00

10/25/2001 John
5103 Gold Round Rock 90131 1234567
TV set 10 150.00 1 150.00 150.00 150.00 0.00
******************************************************************
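
For Part 1, the rule that maps an order to its monthly file can be
sketched in plain Java, independently of the JAXB-generated classes. The
class MonthlyFiles and its methods fileFor and groupByFile are
hypothetical names used only for illustration, and the sketch assumes
that order dates are written as mm/dd/yyyy, as in the sample document
above. It is one possible approach, not a required design.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: decides which monthly file an order belongs to.
final class MonthlyFiles {
    // Assumption: order dates look like "10/22/2001" (mm/dd/yyyy).
    private static final DateTimeFormatter ORDER_DATE =
            DateTimeFormatter.ofPattern("MM/dd/uuuu");
    // File names follow the assignment's poYY-MM convention, e.g. po02-04.
    private static final DateTimeFormatter FILE_SUFFIX =
            DateTimeFormatter.ofPattern("uu-MM");

    // Returns the monthly file name for a single order date.
    static String fileFor(String orderDate) {
        LocalDate date = LocalDate.parse(orderDate, ORDER_DATE);
        return "po" + date.format(FILE_SUFFIX);
    }

    // Groups order dates by target file, preserving the input order,
    // so a week that straddles two months is split across two files.
    static Map<String, List<String>> groupByFile(List<String> orderDates) {
        Map<String, List<String>> byFile = new LinkedHashMap<>();
        for (String d : orderDates) {
            byFile.computeIfAbsent(fileFor(d), k -> new ArrayList<>()).add(d);
        }
        return byFile;
    }

    public static void main(String[] args) {
        List<String> week = Arrays.asList("04/29/2002", "04/30/2002", "05/02/2002");
        // prints {po02-04=[04/29/2002, 04/30/2002], po02-05=[05/02/2002]}
        System.out.println(groupByFile(week));
    }
}
```

In the real program the grouped values would be the unmarshalled order
objects rather than date strings; strings are used here only to keep the
sketch self-contained.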
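
For Part 2, the computation behind the marketing report (filter orders by
an inclusive period, total the purchases per customer, sort by decreasing
total) might look roughly like the sketch below. The classes
MarketingTotals and Purchase are hypothetical stand-ins for data taken
from the unmarshalled orders. The sketch assumes that customers are
identified by name and that each order carries a single total amount
(9.00, 5.00, 7.00 and 150.00 are read as the order totals in the sample
document); check both assumptions against orders.dtd.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MarketingTotals {
    // Hypothetical stand-in for the fields needed from one order.
    static final class Purchase {
        final String customer;
        final LocalDate date;
        final BigDecimal total;

        Purchase(String customer, String date, String total) {
            this.customer = customer;
            this.date = LocalDate.parse(date, ORDER_DATE); // assumed mm/dd/yyyy
            this.total = new BigDecimal(total);
        }
    }

    private static final DateTimeFormatter ORDER_DATE =
            DateTimeFormatter.ofPattern("MM/dd/uuuu");
    // User input format from Part 2: yy-mm-dd (two-digit years parse as 20yy).
    private static final DateTimeFormatter PERIOD_DATE =
            DateTimeFormatter.ofPattern("uu-MM-dd");

    // Totals per customer within [from, to] inclusive, largest total first.
    static List<Map.Entry<String, BigDecimal>> totals(
            List<Purchase> purchases, String from, String to) {
        LocalDate start = LocalDate.parse(from, PERIOD_DATE);
        LocalDate end = LocalDate.parse(to, PERIOD_DATE);
        Map<String, BigDecimal> byCustomer = new HashMap<>();
        for (Purchase p : purchases) {
            if (!p.date.isBefore(start) && !p.date.isAfter(end)) {
                byCustomer.merge(p.customer, p.total, BigDecimal::add);
            }
        }
        List<Map.Entry<String, BigDecimal>> result =
                new ArrayList<>(byCustomer.entrySet());
        result.sort(Map.Entry.<String, BigDecimal>comparingByValue().reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Purchase> sample = Arrays.asList(
                new Purchase("John", "10/22/2001", "9.00"),
                new Purchase("Mary", "10/24/2001", "5.00"),
                new Purchase("Steve", "10/24/2001", "7.00"),
                new Purchase("John", "10/25/2001", "150.00"));
        // prints: John 159.00, Steve 7.00, Mary 5.00 (one per line)
        for (Map.Entry<String, BigDecimal> e : totals(sample, "01-10-22", "01-10-25")) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

BigDecimal is used instead of double so that currency amounts add up
exactly. The sorted entries would then be copied into the customers
object tree and marshalled to file "mark".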