A worked example: single name fold

To illustrate how CSN operates, we step through an example of searching the ACM Digital Library for papers by the author Stefan M. Rüger, using the folding operation to merge the two different forms of the name that appear. If you already have CSN set up in your browser, you can visit the live result of searching the ACM Digital Library for this name and perform the transformations yourself.

Figure 1: Searching for Rüger, Stefan in the ACM Digital Library: initial screen.

Figure 1 (above) shows the initial screen encountered by the user. Figure 2 (below) is an enlarged version of the Refine by People box from the left-hand side of the page, showing the sequence of transforms the user makes to this area with the CSN toolbar during this walkthrough.

Figure 2a shows the initial view of the Refine by People box produced by the ACM Digital Library. Looking through this list of names, we see that Rüger is listed twice: once without his middle name, and then (four lines later) with it.

Figure 2: The different stages of transforming Rüger, Stefan M.: (a) the initial list of names presented; (b) dragging a selected region; (c) folding the selected region by firstname; (d) merging identical results.

Figure 3 (below) shows the result of clicking on the Firstname action in the CSN toolbar: the toolbar has expanded downwards to provide a selection of options available within the Firstname action. The instruction immediately below the main heading directs the user to click and drag out an area of interest within the main web page (when the time comes in our worked example, this will be the boxed list of names previously shown in Figure 2a). The next two items control how the names will be changed. A tick next to the item "Firstname" means the whole first name is kept when the change occurs (but any middle names are removed); moving the tick to "Only initial" means even the first name is reduced to its initial letter.

Figure 3: The options available for the Firstname action.
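
The two options can be illustrated with a minimal Python sketch. It is not CSN's actual code: the function name fold_firstname and the only_initial parameter are hypothetical, and the sketch assumes names are displayed in "Surname, Firstname Middle" form, as they are in the Refine by People box.

```python
def fold_firstname(name: str, only_initial: bool = False) -> str:
    """Fold 'Surname, First Middle ...' to 'Surname, First' or 'Surname, F.'.

    Hypothetical helper: assumes a comma separates surname from given names.
    """
    surname, comma, given = name.partition(",")
    given_parts = given.split()
    if not comma or not given_parts:
        return name.strip()  # not in the assumed form, so leave it alone
    first = given_parts[0]   # any middle names or initials are dropped
    if only_initial:
        first = first[0] + "."  # "Only initial": keep just the first letter
    return f"{surname.strip()}, {first}"


print(fold_firstname("Rüger, Stefan M."))                     # Rüger, Stefan
print(fold_firstname("Rüger, Stefan M.", only_initial=True))  # Rüger, S.
```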

Choosing to work with the "Firstname" transformation, in Figure 2b the user has started to drag out an area of interest. Initially, moving the mouse cursor around selects individual elements within the page, such as the line Rüger, Stefan (58). Clicking on this item and then dragging down causes the selected rectangular area to expand in size. Equally, moving the mouse cursor back up reduces the size of the box again. At this stage of the interaction CSN captures all mouse click events, preventing them from propagating any further to elements in the web page. This ensures that clicking on a hyperlinked item, for example, will not cause the browser to navigate away from the current page. When no item from the CSN toolbar is active, user interaction with hyperlinks on the page proceeds as normal.

When the user releases the mouse from a dragging operation, the selected action (Firstname folding in this case) is applied, any items that are now identical in name are moved next to one another, and the CSN toolbar returns to an inactive state. The merging of identical items does not occur at this point, as there are cases where it makes sense to apply further transformations first. In Figure 2c we can see the result of this applied to the first three items of the author list in the ACM Digital Library. Note that the entry Song, Dawai (15) has been unaffected by the procedure: it happened to be between the two values we were interested in, and has now moved to be after them.
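
The reordering step can be sketched in Python as follows. This is an illustration under stated assumptions rather than CSN's implementation: entries are modelled as (name, count) pairs, the helper names are hypothetical, and identical names are gathered at the position where the name first appears.

```python
from typing import Callable

Entry = tuple[str, int]  # (displayed name, result count): an assumed model


def drop_middle_names(name: str) -> str:
    """Hypothetical Firstname fold: keep the surname and first given name."""
    surname, _, given = name.partition(",")
    parts = given.split()
    return f"{surname.strip()}, {parts[0]}" if parts else name.strip()


def fold_and_group(entries: list[Entry], fold: Callable[[str], str]) -> list[Entry]:
    """Apply fold to every name, then move identical names next to each other.

    Counts are left untouched: merging is a separate, later action.
    """
    folded = [(fold(name), count) for name, count in entries]
    first_seen: dict[str, int] = {}
    for position, (name, _) in enumerate(folded):
        first_seen.setdefault(name, position)
    # sorted() is stable, so entries sharing a name keep their relative order.
    return sorted(folded, key=lambda entry: first_seen[entry[0]])


region = [("Rüger, Stefan", 58), ("Song, Dawai", 15), ("Rüger, Stefan M.", 8)]
print(fold_and_group(region, drop_middle_names))
# [('Rüger, Stefan', 58), ('Rüger, Stefan', 8), ('Song, Dawai', 15)]
```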

Figure 4: Options for merging.

For the sake of simplicity in this worked example we will not consider any more advanced functionality and move straight on to merging the items. In Figure 4 (above) we see the options that result from clicking on the Merge action. Again it is possible to interactively select the area of the web page in which identical, adjacent items should be merged. There is also a Previous region item which, as the name indicates, means the region from the previously selected action will be used. Clicking this results in Figure 2d. The two versions of Rüger's name have indeed been merged, showing a count of 58 + 8 = 66 matches.
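
The merge step can be sketched in a few lines of Python. As before, this is an assumed model and not CSN's code: entries are (name, count) pairs, merge_adjacent is a hypothetical name, and only runs of adjacent identical names are combined, with their counts summed.

```python
from itertools import groupby

Entry = tuple[str, int]  # (displayed name, result count): an assumed model


def merge_adjacent(entries: list[Entry]) -> list[Entry]:
    """Collapse each run of adjacent identical names into one entry."""
    merged = []
    for name, run in groupby(entries, key=lambda entry: entry[0]):
        merged.append((name, sum(count for _, count in run)))
    return merged


folded = [("Rüger, Stefan", 58), ("Rüger, Stefan", 8), ("Song, Dawai", 15)]
print(merge_adjacent(folded))
# [('Rüger, Stefan', 66), ('Song, Dawai', 15)]
```

Because groupby only combines neighbouring items, identical names that are not adjacent would stay separate, which is why the fold action moves them next to one another first.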

At any stage of this sequence of changes, hovering over an item that has been changed by CSN brings up a tooltip that records the history of changes, as a way of explaining to the user what has happened. If the element from the original page already has a tooltip, then the CSN information is appended to it.
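
The tooltip behaviour amounts to appending a history note to whatever tooltip text is already present. The Python sketch below illustrates only that idea: the "CSN:" prefix, the arrow separator and the function name append_history are all hypothetical, since the exact wording CSN displays is not specified here.

```python
def append_history(existing_tooltip: str, history: list[str]) -> str:
    """Return tooltip text with the change history appended.

    history lists the successive forms of the name, oldest first.
    The 'CSN:' prefix and ' -> ' separator are illustrative assumptions.
    """
    note = "CSN: " + " -> ".join(history)
    # Keep the page's own tooltip, if any, and add the CSN note on a new line.
    return f"{existing_tooltip}\n{note}" if existing_tooltip else note


print(append_history("", ["Rüger, Stefan M.", "Rüger, Stefan"]))
# CSN: Rüger, Stefan M. -> Rüger, Stefan
```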

Figure 5: Searching for the merged result of Rüger, Stefan and Rüger, Stefan M. in the ACM Digital Library.

Clicking on the newly formed link allows the user to see the result of these merged items (Figure 5, above). The two result sets corresponding to the searches for the two variants of the name Rüger, Stefan (one with a middle name, the other without) in the ACM Digital Library are brought up simultaneously and shown side by side. While it would be highly desirable to render the search results as a single list, a variety of issues make this difficult to achieve reliably (or even semi-reliably) across a wide range of digital library systems. In practice, we have found that this side-by-side technique works quite well.

One advantage of the side-by-side frames approach used in CSN is that it works independently of the digital library system used. Furthermore, to compensate for the lack of a single unifying list, some care has been taken over the formation of the elements that constitute the side-by-side frames. For instance, neither frame is permitted to include a vertical scrollbar. Instead, the outer page adds a scrollbar if needed, which avoids the known user confusion caused by having inner scrollbars within a larger region that may in some circumstances include its own scrollbar. Side-by-side positioning also plays to the trend in monitor display technology towards increasingly wide devices.

For our example (Figure 5) it is indeed the case that there is a scrollbar located on the right-most side. This is because the search results within the frames exceed the height of the browser's page display area. Changing the position of this scrollbar moves the view of the search results shown within the two frames in unison.

Other functionality

Other folding functionality provided by CSN includes removing accents and punctuation from names, and reducing a name so that it uses just the initial letter of the first name.
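
Accent and punctuation folding can be approximated with Python's standard unicodedata module. This is a sketch of one common technique (Unicode decomposition followed by dropping combining marks), not necessarily the one CSN uses; the function names are hypothetical, and replacing punctuation with a space is an assumption.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_punctuation(text: str) -> str:
    """Replace punctuation with spaces, then collapse runs of whitespace."""
    spaced = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return " ".join(spaced.split())


print(fold_accents("Rüger, Stefan M."))      # Ruger, Stefan M.
print(fold_punctuation("Rüger, Stefan M."))  # Rüger Stefan M
```

Note that this technique maps ü to u, so it yields Ruger and not the transliterated form Rueger; producing the latter would need a language-specific mapping.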

Name authority. The Name authority action makes use of OCLC's online Virtual International Authority File (VIAF) web service, accessible through viaf.org. Amongst other things, this service allows approximate name searches, which CSN exploits to provide a list of possible alternatives to a given name on a web page. To use it, click on the Name authority action, then hover over the name of interest. After a short pause a popup window appears containing any relevant names CSN has been able to locate. Clicking on one of the names in the popup list replaces the web page version with the one you have selected.

Crowdsourcing. CSN also monitors the transformations being made with the toolbar across all the people using CSN, and stores them in a central repository. When the user hovers over a name, if CSN finds a match in the central repository it presents summary statistics of the changes made in a popup window: for example, Stefan M. Rüger has been changed 15 times to Stefan M. Rueger and 25 times to S. Rüger. As with the Name authority action, clicking on one of the alternatives replaces the name as it appears in the web page being displayed by the browser.
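
The summary statistics can be modelled as a tally of replacements per original name. The Python sketch below uses an in-memory stand-in for the central repository; the class name ChangeLog and its methods are hypothetical, and how CSN actually stores or transmits this data is not described here.

```python
from collections import Counter, defaultdict


class ChangeLog:
    """In-memory stand-in for the central repository (an assumption)."""

    def __init__(self) -> None:
        self._changes: defaultdict[str, Counter] = defaultdict(Counter)

    def record(self, original: str, replacement: str) -> None:
        """Note that one user changed original into replacement."""
        self._changes[original][replacement] += 1

    def summary(self, name: str) -> list[tuple[str, int]]:
        """Return (replacement, times used) pairs, most frequent first."""
        return self._changes.get(name, Counter()).most_common()


log = ChangeLog()
for _ in range(15):
    log.record("Stefan M. Rüger", "Stefan M. Rueger")
for _ in range(25):
    log.record("Stefan M. Rüger", "S. Rüger")

print(log.summary("Stefan M. Rüger"))
# [('S. Rüger', 25), ('Stefan M. Rueger', 15)]
```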

Expand. At the right-most end of the CSN toolbar the user can set the mode of operation to either fold or expand. In expand mode, when the user selects an action it replaces the highlighted text with an expanded version. In this mode, rather than applying CSN to faceted lists of author names, it makes the most sense to apply the action to a text query box where the user has already entered some information. This is not a constraint, however: if there are other HTML elements where expansion makes sense, the user is able to apply it there too.

To take a more concrete example, say a user has entered the query Witten, Ian into the search page of the IEEE Xplore digital library, but has not yet pressed the search button. Instead, as their next step they click on Name authority from the CSN toolbar (with the radio button in the toolbar set to expand), which (as in folding mode) then invites the user to select an area of text to transform. Selecting the text in this query box (Witten, Ian) results in the query term being expanded to include the VIAF known variants Witten, I. H. and Witten, Ian H. Pressing the search button now, the user initiates a broader (more inclusive) query for articles by this author compared with their original query.
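
The expansion of a query term can be sketched as follows. The variants are passed in directly instead of being fetched from VIAF, and the output syntax (quoted phrases joined with OR) is purely an assumption for illustration: the form CSN writes into the query box, and the syntax a given digital library accepts, may differ. The function name expand_query is hypothetical.

```python
def expand_query(query: str, variants: list[str]) -> str:
    """Combine the original query with its name variants.

    Assumed output format: quoted phrases joined with OR.
    Duplicates are dropped while preserving order.
    """
    terms: list[str] = []
    for term in [query, *variants]:
        cleaned = term.strip()
        if cleaned and cleaned not in terms:
            terms.append(cleaned)
    return " OR ".join(f'"{term}"' for term in terms)


print(expand_query("Witten, Ian", ["Witten, I. H.", "Witten, Ian H."]))
# "Witten, Ian" OR "Witten, I. H." OR "Witten, Ian H."
```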