OpenCloud - Getting Started Guide

Introduction

OpenCloud is a Java library for generating tag clouds, also known as weighted lists. The two main classes in the library are the Tag class, which represents a single tag (essentially a string with an associated URL), and the Cloud class, which represents the tag cloud as a whole. The Cloud class behaves like a collection that you populate by adding Tag objects.

Each tag has a score that represents its level of importance. Tags with a higher score are assigned a higher weight. When a tag is added to the Cloud object and a tag with the same name is already present, the two scores are summed. Since the default score is 1.0, if you don't specify score values, the total score of a tag equals the number of times it has been added to the Cloud (the tag's frequency of occurrence).

The Cloud class converts the scores to weight values using a linear equation. The user can choose the range of values that the weight can assume, so that weight values can be conveniently used for tag cloud visualization.

Quick start

You can create a simple tag cloud by following these steps:

  1. Create a Cloud object and set its properties. One of the most commonly used properties is the maximum weight, which defines the upper bound of the weights assigned to tags. It can be set to a convenient value, e.g. the maximum font size. The minimum weight can often be left at its default value of zero.

    Cloud cloud = new Cloud();  // create the cloud
    cloud.setMaxWeight(38.0);   // max font size

  2. Populate the tag cloud by creating Tag objects and adding them to the cloud. As noted above, the Cloud object by default counts the number of times a tag has been added, so more frequent tags have a higher score.

    Tag tag = new Tag("Google", "http://www.google.com");  // creates a tag
    cloud.addTag(tag);                                      // adds it to the cloud

  3. Call the tags method of the Cloud class to obtain a list of the tags composing the tag cloud, each with its weight assigned. Then iterate over the list and write the HTML code.

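    <% for (Tag tag : cloud.tags()) { %>
    <a href="<%= tag.getLink() %>" style="font-size: <%= tag.getWeight() %>px;"><%= tag.getName() %></a>
    <% } %>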

    In this example the getLink, getWeight and getName methods are used to compose the HTML link.

The following sections describe these steps in more detail.

Creating a tag cloud

To create a tag cloud you have to instantiate a Cloud object and set its properties. The most common properties are described below.

Choosing a weight range

The range of weight values can be defined with the setMinWeight and setMaxWeight methods. The default value for the minimum weight is zero.

cloud.setMaxWeight(38.0); // weight values will range between 0.0 and 38.0
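To illustrate how the chosen range affects the result, here is a minimal standalone sketch of a linear score-to-weight mapping. The class name LinearWeightSketch, the toWeight method and the exact formula (scaling each score by the highest score) are assumptions made for illustration; this guide only states that the conversion is linear, so the equation actually used by OpenCloud may differ.

```java
final class LinearWeightSketch {

    /** Maps a score onto [minWeight, maxWeight] in proportion to the highest score. */
    static double toWeight(double score, double maxScore, double minWeight, double maxWeight) {
        if (maxScore <= 0.0) {
            return minWeight; // no meaningful scores: fall back to the minimum weight
        }
        double normalized = score / maxScore; // 1.0 for the top-scoring tag
        return minWeight + normalized * (maxWeight - minWeight);
    }

    public static void main(String[] args) {
        // With the range 0.0..38.0, a tag with half the top score gets half the top weight.
        System.out.println(toWeight(10.0, 10.0, 0.0, 38.0)); // prints 38.0
        System.out.println(toWeight(5.0, 10.0, 0.0, 38.0));  // prints 19.0
    }
}
```

With a maximum weight equal to the largest font size, the resulting weights can be used directly as font sizes.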

Setting a default link

Most of the time, tag URLs share the same structure and differ only in the tag name. In these cases a default URL can be set. It consists of a format string with one string parameter, indicated by the %s format specifier. The parameter is replaced with the tag name.

cloud.setDefaultLink("http://www.flickr.com/photos/tags/%s/");

The default link is used whenever a Tag has a null link; otherwise the link associated with the Tag takes precedence.
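The precedence rule can be pictured with a small standalone sketch based on String.format. The class DefaultLinkSketch and the resolveLink method are hypothetical names, not part of OpenCloud, and the sketch does not URL-encode the tag name, since this guide does not say whether the library does.

```java
final class DefaultLinkSketch {

    /** Returns the tag's own link if present, otherwise fills the name into the default format. */
    static String resolveLink(String tagName, String tagLink, String defaultLinkFormat) {
        if (tagLink != null) {
            return tagLink; // an explicit link takes precedence over the default
        }
        return String.format(defaultLinkFormat, tagName); // %s is replaced by the tag name
    }

    public static void main(String[] args) {
        String format = "http://www.flickr.com/photos/tags/%s/";
        System.out.println(resolveLink("sunset", null, format));
        // prints http://www.flickr.com/photos/tags/sunset/
        System.out.println(resolveLink("Google", "http://www.google.com", format));
        // prints http://www.google.com
    }
}
```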

Setting the number of tags to display

Using the setMaxTagsToDisplay method you can specify the maximum number of tags composing the tag cloud.

cloud.setMaxTagsToDisplay(50); // the displayed tag cloud will contain at most 50 tags

Setting the tag case

Using the setTagCase method you can specify how to handle the case of the tag names. The possible options are:

  • LOWER: tag names are converted to lower case.
  • UPPER: tag names are converted to upper case.
  • CAPITALIZATION: tag names are capitalized (first letter upper case, other letters lower case).
  • PRESERVE_CASE: tag names are not modified and they are case insensitive (e.g. "Home" and "home" are considered the same tag).
  • CASE_SENSITIVE: tag names are not modified and they are case sensitive (e.g. "Home" and "home" are considered different tags).

When PRESERVE_CASE is specified, the case of the most recently added tag is used. To keep the Cloud behavior consistent, set the case when you instantiate the object, before adding any tags.
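The five options listed above can be read as two separate decisions: how a name is displayed, and whether two names count as the same tag. The following standalone sketch models that reading. The class TagCaseSketch, its nested Case enum and both methods are hypothetical stand-ins written for this illustration; they are not the library's own types or implementation.

```java
import java.util.Locale;

final class TagCaseSketch {

    /** Hypothetical mirror of the five options described above. */
    enum Case { LOWER, UPPER, CAPITALIZATION, PRESERVE_CASE, CASE_SENSITIVE }

    /** Name as it would be displayed under the given option. */
    static String displayName(String name, Case option) {
        switch (option) {
            case LOWER:
                return name.toLowerCase(Locale.ROOT);
            case UPPER:
                return name.toUpperCase(Locale.ROOT);
            case CAPITALIZATION:
                if (name.isEmpty()) {
                    return name;
                }
                return name.substring(0, 1).toUpperCase(Locale.ROOT)
                        + name.substring(1).toLowerCase(Locale.ROOT);
            default:
                return name; // PRESERVE_CASE and CASE_SENSITIVE leave the name untouched
        }
    }

    /** Key used to decide whether two names denote the same tag. */
    static String identityKey(String name, Case option) {
        // Only CASE_SENSITIVE distinguishes "Home" from "home".
        return option == Case.CASE_SENSITIVE ? name : name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(displayName("hOME", Case.CAPITALIZATION)); // prints Home
        System.out.println(identityKey("Home", Case.PRESERVE_CASE)
                .equals(identityKey("home", Case.PRESERVE_CASE)));    // prints true
        System.out.println(identityKey("Home", Case.CASE_SENSITIVE)
                .equals(identityKey("home", Case.CASE_SENSITIVE)));   // prints false
    }
}
```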

Populating the tag cloud

Once you have created and customized the Cloud object you can start inserting tags.

Creating tags

A Tag object has four main properties: name, link (URL), score and creation date. All four are optional when constructing a Tag. By default, name and link are null, the score is 1.0 and the creation date is the current time.

// some constructors
Tag tag;
tag = new Tag();                                                    // default constructor
tag = new Tag("test");                                              // name
tag = new Tag("test", "http://www.google.com/search?q=test");       // name and link
tag = new Tag("test", 3.5);                                         // name and score
tag = new Tag("test", "http://www.google.com/search?q=test", 3.5);  // name, link and score

Adding tags

To add a tag to the tag cloud, you can create a Tag object and insert it in the Cloud object using the addTag method. If a tag with the same name is already present, the scores of the two tags are summed.

cloud.addTag(new Tag("art", 2.5));

You can add more than one tag at once using the addTags method, which accepts a collection of Tag objects.
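The summing behavior described in this section can be modeled with a plain map. The sketch below is a standalone illustration, not OpenCloud code: ScoreSumSketch, add and scoreOf are hypothetical names, and a Map from tag name to score stands in for the Cloud object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ScoreSumSketch {

    private final Map<String, Double> scores = new LinkedHashMap<>();

    /** Adds a tag with the default score of 1.0. */
    void add(String name) {
        add(name, 1.0);
    }

    /** Adds a tag; if the name is already present, the two scores are summed. */
    void add(String name, double score) {
        scores.merge(name, score, Double::sum);
    }

    /** Returns the accumulated score, or 0.0 for an unknown name. */
    double scoreOf(String name) {
        return scores.getOrDefault(name, 0.0);
    }

    public static void main(String[] args) {
        ScoreSumSketch cloud = new ScoreSumSketch();
        cloud.add("art", 2.5);
        cloud.add("art");      // default score 1.0 is added to the existing 2.5
        cloud.add("music");
        System.out.println(cloud.scoreOf("art"));   // prints 3.5
        System.out.println(cloud.scoreOf("music")); // prints 1.0
    }
}
```

When every tag is added with the default score, the accumulated value is simply the number of times the name was added.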

Tag extraction from text

Another way of adding tags is to pass a text string to the addText method. It extracts words from the text and adds a Tag to the Cloud object for each word identified. A sequence of characters is treated as a word when it matches a predefined regular expression, which can be changed through the setWordPattern method.

The URLs of these tags are composed using the default link, which can be set through the setDefaultLink method.
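The extraction step can be approximated with java.util.regex. The following standalone sketch counts every match of a word pattern in a text, with each match playing the role of one added tag. WordExtractionSketch and extract are hypothetical names, the pattern [A-Za-z]+ is only an example (this guide does not state the library's predefined expression), and lower-casing the matches is a simplification chosen for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordExtractionSketch {

    /** Counts every match of wordPattern in text; each match stands for one added tag. */
    static Map<String, Double> extract(String text, String wordPattern) {
        Map<String, Double> scores = new LinkedHashMap<>();
        Matcher matcher = Pattern.compile(wordPattern).matcher(text);
        while (matcher.find()) {
            String word = matcher.group().toLowerCase(Locale.ROOT);
            scores.merge(word, 1.0, Double::sum); // repeated words accumulate score
        }
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(extract("Clouds of tags, tags of clouds", "[A-Za-z]+"));
        // prints {clouds=2.0, of=2.0, tags=2.0}
    }
}
```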

Tag filtering

If you want to exclude tags with certain characteristics from the tag cloud, you can use filters. For example, you may want to ignore words that are too short, too long, or present in a blacklist.

There are two types of filters: input filters and output filters. If a tag doesn't pass an input filter, it is not stored in the Cloud object. If a tag doesn't pass an output filter, it is stored in the Cloud object but is not shown in the final tag cloud, i.e. it is not returned by the tags method. Output filters are useful when the filter parameters can change over time. For example, if a term is hidden by an output filter because it is on a blacklist, and the term is later removed from the blacklist, the tag cloud content changes dynamically and the term is shown.
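The difference between the two filter types is easiest to see in a small model. The sketch below is a standalone illustration built on java.util.function.Predicate; FilterSketch, add and visible are hypothetical names and do not reflect OpenCloud's filter API, which this guide does not detail.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class FilterSketch {

    private final List<String> stored = new ArrayList<>();
    private final Predicate<String> inputFilter;
    private final Predicate<String> outputFilter;

    FilterSketch(Predicate<String> inputFilter, Predicate<String> outputFilter) {
        this.inputFilter = inputFilter;
        this.outputFilter = outputFilter;
    }

    /** Names rejected by the input filter are never stored. */
    void add(String name) {
        if (inputFilter.test(name)) {
            stored.add(name);
        }
    }

    /** Names rejected by the output filter stay stored but are not returned. */
    List<String> visible() {
        List<String> result = new ArrayList<>();
        for (String name : stored) {
            if (outputFilter.test(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> blacklist = new HashSet<>();
        blacklist.add("spam");
        FilterSketch cloud = new FilterSketch(
                name -> name.length() >= 3,          // input filter: drop very short words
                name -> !blacklist.contains(name));  // output filter: hide blacklisted words
        cloud.add("to");
        cloud.add("spam");
        cloud.add("java");
        System.out.println(cloud.visible()); // prints [java]
        blacklist.remove("spam");            // the blacklist changes over time
        System.out.println(cloud.visible()); // prints [spam, java]
    }
}
```

Because "spam" was only hidden by the output filter, it reappears once the blacklist changes, whereas "to" was rejected on input and is gone for good.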

Displaying the tag cloud

The following sections describe how to obtain the output tag cloud and display it in a JSP page using HTML code.

Getting tags to display

To get the tags that compose the output tag cloud call the tags method of the Cloud object. The method assigns a weight to each tag based on the tag score and returns a List of Tag objects that represents the final tag cloud.

HTML / CSS generation

To display a tag cloud in a JSP page you can iterate through the list of tags returned from the tags method and generate the HTML code fragment associated with each tag. For example:

<%
// iterate over the output tags
for (Tag tag : cloud.tags()) {
    // add an HTML link for each tag
%>
<a href="<%= tag.getLink() %>" style="font-size: <%= tag.getWeight() %>px;"><%= tag.getName() %></a>
<%
}
%>

To obtain the level of importance of a tag within the tag cloud, you can call the getWeight and getWeightInt methods of the Tag class. The getWeightInt method returns the weight rounded to an integer.

Another way to generate the HTML code is by using the provided HTMLFormatter class.
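Outside a JSP page, the same link markup can be assembled in plain Java. The sketch below is a standalone illustration and is unrelated to HTMLFormatter: HtmlLinkSketch, toHtml and escape are hypothetical names, the name, link and weight are passed in as plain values, and rounding with Math.round is an assumption about what "rounded to an integer" means.

```java
final class HtmlLinkSketch {

    /** Builds one anchor element whose font size is the weight rounded to an integer. */
    static String toHtml(String name, String link, double weight) {
        long fontSize = Math.round(weight);
        return "<a href=\"" + escape(link) + "\" style=\"font-size: " + fontSize + "px;\">"
                + escape(name) + "</a>";
    }

    /** Minimal HTML escaping for text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("Google", "http://www.google.com", 37.6));
        // prints <a href="http://www.google.com" style="font-size: 38px;">Google</a>
    }
}
```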

Types of ordering

There are four predefined Comparator classes that can be used to specify how tags are sorted: NameComparatorAsc, NameComparatorDesc, ScoreComparatorAsc and ScoreComparatorDesc. You specify the ordering by passing an instance of one of these classes to the tags method. For example, to sort the tags in descending order of score, call the tags method with a ScoreComparatorDesc object. By default the tags returned by the tags method are sorted alphabetically, i.e. NameComparatorAsc is used.

cloud.tags();                               // by default tags are sorted alphabetically
cloud.tags(new Tag.NameComparatorAsc());    // sorts alphabetically (equivalent to the previous instruction)
cloud.tags(new Tag.ScoreComparatorDesc());  // sorts by score, in descending order

To order the tags by a custom criterion, you need to create a class that implements the Comparator interface.
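As an example of a custom criterion, the standalone sketch below sorts longer names first and breaks ties alphabetically. To stay runnable without the library it uses SimpleTag, a hypothetical stand-in for the Tag class that holds only a name and a score; with OpenCloud you would presumably implement the Comparator for Tag and pass an instance to the tags method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CustomOrderSketch {

    /** Hypothetical stand-in for OpenCloud's Tag, holding only a name and a score. */
    static final class SimpleTag {
        final String name;
        final double score;

        SimpleTag(String name, double score) {
            this.name = name;
            this.score = score;
        }

        @Override
        public String toString() {
            return name + "(" + score + ")";
        }
    }

    /** Custom criterion: longer names first, ties broken alphabetically. */
    static final class NameLengthComparator implements Comparator<SimpleTag> {
        @Override
        public int compare(SimpleTag a, SimpleTag b) {
            int byLength = Integer.compare(b.name.length(), a.name.length());
            return byLength != 0 ? byLength : a.name.compareTo(b.name);
        }
    }

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<SimpleTag> sorted(List<SimpleTag> tags) {
        List<SimpleTag> copy = new ArrayList<>(tags);
        copy.sort(new NameLengthComparator());
        return copy;
    }

    public static void main(String[] args) {
        List<SimpleTag> tags = Arrays.asList(
                new SimpleTag("art", 2.5),
                new SimpleTag("photography", 1.0),
                new SimpleTag("video", 1.0),
                new SimpleTag("music", 4.0));
        System.out.println(sorted(tags));
        // prints [photography(1.0), music(4.0), video(1.0), art(2.5)]
    }
}
```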

Thresholds

Thresholds can be set so that tags that don't reach a certain score are not shown in the tag cloud. You can define a threshold on the score using the setThreshold method, and a threshold on the normalized score using the setNormThreshold method. The normalized score is proportional to the score but ranges between 0.0 and 1.0 (the tags with the highest score have a normalized score of 1.0).
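The normalized threshold can be illustrated with a short standalone sketch. It computes the normalized score as the score divided by the highest score, which matches the description above, and keeps a tag when that value is greater than or equal to the threshold. ThresholdSketch and applyNormThreshold are hypothetical names, and treating a score equal to the threshold as "reached" is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThresholdSketch {

    /** Keeps only the entries whose score divided by the highest score reaches normThreshold. */
    static Map<String, Double> applyNormThreshold(Map<String, Double> scores, double normThreshold) {
        double maxScore = 0.0;
        for (double score : scores.values()) {
            maxScore = Math.max(maxScore, score);
        }
        Map<String, Double> kept = new LinkedHashMap<>();
        if (maxScore <= 0.0) {
            return kept; // nothing to normalize against
        }
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            double normalized = entry.getValue() / maxScore; // 1.0 for the top-scoring tag
            if (normalized >= normThreshold) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("java", 8.0); // normalized 1.0
        scores.put("xml", 2.0);  // normalized 0.25
        scores.put("jsp", 4.0);  // normalized 0.5
        System.out.println(applyNormThreshold(scores, 0.5)); // prints {java=8.0, jsp=4.0}
    }
}
```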

Serialization

Sometimes it is necessary to store a Cloud object in a file, in binary or XML format. OpenCloud itself doesn't provide serialization methods, but serialization can be performed through standard classes provided by the Java platform or through external libraries.

Binary serialization

Binary serialization of the Cloud object can be achieved through standard Java classes. For instance:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Writes a Cloud object to a binary file
FileOutputStream fos = new FileOutputStream("test_file.dat");
ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.writeObject(cloud1);
oos.close();

// Reads back the Cloud from the binary file
Cloud cloud2;
FileInputStream fis = new FileInputStream("test_file.dat");
ObjectInputStream ois = new ObjectInputStream(fis);
cloud2 = (Cloud) ois.readObject();
ois.close();

XML serialization

An easy way to serialize a Cloud object to XML is to use the XStream open-source library, which can be downloaded from xstream.codehaus.org.

XStream xstream = new XStream();

// Converts a Cloud object to XML
String xml = xstream.toXML(cloud1);

// Recreates the Cloud object from XML
Cloud cloud2 = (Cloud) xstream.fromXML(xml);