Integrated

Library Management Information System (LIS)

Spreadsheet Subsystem

 

To control all the activities of the system it is necessary to keep track of the users needs, tendencies in changing the nature of services etc. It is therefore very important to analyze statistics of requests, trends, scope of data collected. The problem is however that the needs for statistical analysis cannot be predefined once for ever. There are some standard statistics run once a month or once a year. But there are also cases that an extra statisctical research is needed. To this end a special Spreadsheet Subsystem has been designed and implemented which simplifies defining data analysis. The subsystem is integrated within the LIS system. The main goal of the system is to aggregate information contained in the LIS databases (mainly Catalogue, Loan and Acquisition databases) and process it to the form which is appropriate for the library staff.

Below we describe the main concepts of the subsystem and provide detailed presentation of its functioning.

 

1. Basic Concepts

The output of the system is table. The table's cells create rows and columns. To every row and column in the table we assign row query (or r-query) and column query (or c-query) respectively, which are standard ISIS queries (Table 1).

Table 1

 

c-query1

c-query2

c-query3

r-query1

cell11

cell12

cell13

r-query2

cell21

cell22

cell23

r-query3

cell31

cell32

cell33

 

To every cell we assign a cell query (qij), which is received by anding c-queryi with r-queryj, so,

 qij= c-queryi * r-queryj,

Thus, to every cellij of the output table we can assign a set of records from an ISIS database which create an answer to the query qij. Such an answer is the argument of an aggregate function. The following aggregate functions have been implemented:

 In addition there are two functions that work on the table generated

Below we illustrate the concepts by an example:

 

Example

Assume that one wants to produce a table showing how many books have been ordered from particular suppliers in the years 1989, 1990, 1991, 1992, 1992:

  

 

1989

1990

1991

1992

1993

1994

ACM

22

321

33

432

200

12

Springer

10

3

3

2

21

1

Wiley

0

0

11

11

3

4

IEEE

3

3

2

1

3

1

UN

0

0

0

1

0

0

>>TOTAL<<

35

327

49

447

227

18

For the column 1 of this table we have to define a query YP=1989, whereas for the first row we define a query SU=SPRINGER. So for cell221 the system will apply the query:

YP=1989*SU=Springer

On the search result for this query we have to apply the aggregate function COUNT. One can easily notice that the display format for this cell could be

 f(val(v1),4,1)

  

2. Defining Spreadsheet

 Spreadsheets can be defined on standard ISIS worksheets. The basic idea is that for every column we define column title, column query, display format, and the aggregation function along with the function argument. In addition we specify where the column starts. We may define up to 10 columns.

The number of row definitions is also restricted to 10. In this case however one can formulate queries with variable part. So instead of defining a query for every row in the example above we could define one query with a variable {SU=} (by closing "SU=" in the brackets we indicate that SU= is a variable). The variable {SU=} gets values from inverted file prefixed by SU=. In such a way we actually define a class of queries, each built of one term from the inverted file. For each such query a row will appear in the resault table. Only terms starting with the prefix "SU=" are used for creating row queries. The titles of rows should also be variable. They may take consecutive terms, as values from the inverted file (stripped of the prefix).

So for the example above the column queries are YP=1989, YP=1990, etc., whereas the only row query is {SU=}.This means that for each row the column queries are combined with the consecutive terms from the inverted file, say SU=ACM, SU=Springer, etc.

Below a worksheet for a spreadsheet definition is discussed (Figures 1-3).

 

Fig. 1 Spreadsheet definition, page 1

The basic fields are as follows

 Pages 2- 6 are used to define columns parameters. Below we present only Page 2.

 

Fig. 2 Definitions of Columns 2 and 3

For each column we define: 

  • Title defines a title for a column
  • Column format defines a format for the results in all the cells in the column
  • Search expression is a column expression;
  • Aggregation Function defines the operation to be run on the column results;
  • Argument selection defines the field to be taken as argument of the aggregate function; if the function is COUNT, the argument is meaningless. For the arguments like SUM, AVG, etc. we have to specify which value is to be taken into account. The value is created by means of the ISIS Print Format, so it can be quite general.
  •  
  • On Figure 3 we present the screen for the row definitions. Here, we define row title and row expression for at last 10 rows.

  • Fig. 3 Definition of rows

  • 3. Spreadsheet menu options

    When entering the spreadsheet subsystem the database administrator may browse a table with spreadsheet identifiers, each preceded by the database name to which it is applicable. The table may be browsed in a standard way as in case of BROWSE function. In the left window the explanation on functioning of the spreadsheet is displayed.

  • Fig. 4. Selection of the spreadsheet

  • The following options are available:

  • Create spreadsheet;
  • Test spreadsheet - this option activates the validation program to checkout the spreadsheet;
  • Edit. When you position a cursor on a spreadsheet and select the option E, the update process is activated;
  • Run. When you position a cursor on a spreadsheet and select the option R the spreadsheet is activated and a new table is calculated taking into account the lastmost version of the database;
  • Delete - it allows one to delete a worksheet
  • Copy/Rename functions
  • More than one result tables may be attached to the spreadsheet. Each table is identified by a date when the spreadsheet was run. So, one can also analyse tendencies over a given period. The result tables are sorted by date, so that we can realize when the spreadsheet was run.

    The spreadsheet definition process is fairly simple. What is needed it is an experience with ISIS, its print format, FDT and FST structures.