CS 175/286 - Homework 5

In this assignment, you will read the Festival object from the new XML at http://payments.cinequest.org/websales/feed.ashx?guid=70d8e056-fa45-4221-9cc7-b6dc88f62c98&showslist=true.

Make a directory cs175/hw05/src and copy the edu/sjsu/cinequest/comm/cinequestitem directory from the Cinequest project into it, but remove the User class.

In CinequestItem, Festival, and Schedule, change Persistable to Serializable.

From edu/sjsu/cinequest/comm, copy CharUtils, Platform, ConnectionHelper, MessageDigest, and Callback.

From edu/sjsu/cinequest/comm/xmlparser, copy BasicHandler and FestivalParser.

From the JavaCommonTest, copy the edu.sjsu.cinequest.javase package and FestivalParserTest. Copy everything into the src tree.

In JavaSEPlatform, remove the obsolete import of WebConnection.

Make a Java SE (not Android) project with all this code, and run the unit test. It should pass.

The new feed is UTF-8. In BasicHandler, remove the call to fixWin1252. If you find any place where an input stream is set to ISO 8859-1, change that to UTF-8.
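Here is a minimal sketch of reading a stream as UTF-8 instead of ISO 8859-1. The class name Utf8ReadSketch and the method readAll are made up for illustration; the actual place where the project sets the encoding may look different.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Cinequest code.
public class Utf8ReadSketch {
    // Reads the entire stream, decoding its bytes as UTF-8.
    public static String readAll(InputStream in) {
        StringBuilder result = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                result.append(buffer, 0, n);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }
}
```

The point is the explicit charset argument: without it, the decoder falls back to a default that may not be UTF-8.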

As you can see, the new feed generates a sequence of Show objects. Make classes Show, Showing, and Venue in edu.sjsu.cinequest.comm.cinequestitem, like this:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Show {
    public String id;
    public String name;
    public int duration;
    public String shortDescription;
    public String thumbImageURL;
    public String eventImageURL;
    public String infoLink;
    public Map<String, ArrayList<String>> customProperties = new HashMap<String, ArrayList<String>>();
    public List<Showing> currentShowings = new ArrayList<Showing>();
}

public class Showing {
    public String id;
    public String startDate;
    public String endDate;
    public String shortDescription;
    public Venue venue = new Venue();
}

public class Venue {
    public String id;
    public String name;
    public String address;
}

Don't worry about the public fields. It's ok. Really.

In FestivalParser, write a method parseShows that returns a List<Show> instead of a Festival. Throw away any code you no longer need. Translate the CustomProperty sequences into a Map<String, ArrayList<String>>. You can discard the Group, Sequence, and Hidden elements.
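One possible way to collect the CustomProperty values is a small helper that appends each value to the list stored under its name. This is only a sketch: the class CustomPropertySketch and the method addProperty are hypothetical, and how you obtain the name and value from the feed depends on your parsing strategy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the Cinequest code.
public class CustomPropertySketch {
    // Appends value to the list stored under name, creating the list on first use.
    public static void addProperty(Map<String, ArrayList<String>> props,
                                   String name, String value) {
        ArrayList<String> values = props.get(name);
        if (values == null) {
            values = new ArrayList<String>();
            props.put(name, values);
        }
        values.add(value);
    }

    public static void main(String[] args) {
        Map<String, ArrayList<String>> props = new HashMap<String, ArrayList<String>>();
        // "Genre" and its values are made-up sample data.
        addProperty(props, "Genre", "Drama");
        addProperty(props, "Genre", "Comedy");
        addProperty(props, "Type of Film", "Shorts Program");
        System.out.println(props.get("Genre").size());
    }
}
```

You would call addProperty once per name/value pair, passing the customProperties map of the Show that is currently being built.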

In your report.txt, explain the strategy that you use for parsing.

Write a unit test edu.sjsu.cinequest.comm.xmlparser.ListOfShowTest that tests your method. It should have at least 5 test cases.

Push your code to your repo.

Now comes the hard part. We need a Festival object, not a List<Show>. The people who designed this feed have never heard of normalization. Look at the venues. What one should do is have a list of venues, each with an ID, and then use venue IDs, so that the detail information isn't endlessly duplicated. You will need to remove duplicates.
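Removing the duplicates could be sketched like this, assuming that two venues with the same ID are the same venue. The class VenueDedupSketch, its nested stand-in Venue, and the factory method venue are illustrative only; in your code you would use the real Venue class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Cinequest code.
public class VenueDedupSketch {
    // Stand-in with the same fields as the Venue class of this assignment.
    public static class Venue {
        public String id;
        public String name;
        public String address;
    }

    // Convenience factory used only by this sketch.
    public static Venue venue(String id, String name) {
        Venue v = new Venue();
        v.id = id;
        v.name = name;
        return v;
    }

    // Keeps the first venue seen for each ID, in order of first appearance.
    public static List<Venue> uniqueVenues(List<Venue> all) {
        Map<String, Venue> byId = new LinkedHashMap<String, Venue>();
        for (Venue v : all) {
            if (!byId.containsKey(v.id)) {
                byId.put(v.id, v);
            }
        }
        return new ArrayList<Venue>(byId.values());
    }
}
```

A LinkedHashMap is used so that the result order is predictable, which makes the unit tests easier to write.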

Also, the Festival class is very careful to distinguish between a Film (an item with a title, director, etc., which may be shown alone, or before a feature film, or as a part of a "Shorts" feature), a ProgramItem (a collection of films that may be shown one or more times), and a Schedule (a program item that is shown at a particular point in time). That subtlety was lost on the vendor whose feed we are consuming. Your thankless task is to recover the information needed by the Android app so that it doesn't have to change at all. Supply a method public static Festival parseFestival(String url, Callback callback) in the FestivalParser class where you first call the parseShows method and then do the cleanup.

Make a ProgramItem out of each Show with at least one CurrentShowing. (You can recognize short films by the fact that they have an empty currentShowings list.)

Make a Film out of each Show whose "Type of Film" CustomProperty does not contain "Shorts Program", and add that film to the matching ProgramItem.

For each Show with an empty currentShowings list, find the ProgramItem in which the Film occurs. Look at the description. It follows the regex

(Part of|Plays (with|before) the feature film) ([^.]+)\..*

Extract the match of the third group, and that is the title of the program item to which the film belongs. Add it to its list of films. (There are a few short films where the title doesn't follow that regex, or where there is no matching program item. Skip them.)
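The extraction can be sketched with java.util.regex as follows. The class ProgramTitleSketch and the method programTitle are hypothetical names. The DOTALL flag is an assumption, added so that a description containing line breaks can still match the trailing .* part.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Cinequest code.
public class ProgramTitleSketch {
    // The regex from the assignment; group 3 is the program item title.
    private static final Pattern PART_OF = Pattern.compile(
        "(Part of|Plays (with|before) the feature film) ([^.]+)\\..*",
        Pattern.DOTALL);

    // Returns the program item title, or null if the description does not match.
    public static String programTitle(String description) {
        if (description == null) {
            return null;
        }
        Matcher m = PART_OF.matcher(description);
        return m.matches() ? m.group(3) : null;
    }
}
```

A null result corresponds to the case where the short film should be skipped. The same applies when the title is found but no program item has that title.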

Each CurrentShowing corresponds to a Schedule element. Link the schedule with its program item. Also, note that each film has a list of schedules attached to it. (That is to save an expensive lookup when displaying a film.) Populate those lists with references to the same schedule objects that are in the schedules vector of the Festival object. (That is, don't make new objects.)
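To illustrate what sharing the same schedule object means, here is a small sketch. The nested Schedule and Film classes are simplified stand-ins, not the real classes from the cinequestitem package, and the method link is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of the Cinequest code.
public class SharedScheduleSketch {
    // Simplified stand-ins; the real Schedule and Film classes differ.
    public static class Schedule {
        public String id;
    }

    public static class Film {
        public String id;
        public List<Schedule> schedules = new ArrayList<Schedule>();
    }

    // Adds one schedule object to the festival-wide list and to each film's list.
    public static void link(Schedule schedule, List<Film> filmsInProgram,
                            List<Schedule> festivalSchedules) {
        festivalSchedules.add(schedule);
        for (Film film : filmsInProgram) {
            film.schedules.add(schedule); // the same reference, not a copy
        }
    }
}
```

After a call to link, comparing the festival's entry and a film's entry with == yields true, which is something a test case can check.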

Populate the Festival list of VenueLocation with unique venues. (There are ten.) In the old feed, there were venue abbreviations (C12, CAL, REP, etc.). We don't have them here. Synthesize them by removing anything that is not an uppercase letter or digit. That gives you abbreviations such as C12S7. Note that each Schedule has such an abbreviation and not a reference to the VenueLocation.
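The abbreviation rule can be sketched in one line, assuming that the abbreviation is derived from the venue name. The class VenueAbbreviationSketch and the method abbreviate are hypothetical, and the sample venue name below is a guess, not taken from the feed.

```java
// Hypothetical helper, not part of the Cinequest code.
public class VenueAbbreviationSketch {
    // Removes every character that is not an uppercase ASCII letter or a digit.
    public static String abbreviate(String venueName) {
        return venueName.replaceAll("[^A-Z0-9]", "");
    }

    public static void main(String[] args) {
        // Made-up venue name for illustration.
        System.out.println(abbreviate("Camera 12 Screen 7")); // prints C12S7
    }
}
```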

For the IDs of the various items, use the IDs that you find in the feed.

In your report.txt, explain your strategies, and which issues you found most annoying.

Add five test cases to the FestivalParserTest class. Hint: In your test cases, use the getProgramItemForId and getFilmForId methods of the Festival class so you can get at the interesting objects easily.

Be sure to push your repo again.