The Cluster Science Data System (CSDS) is a network-based system established to assist in the distribution of the vast amount of data expected to be acquired during the Cluster mission. The User Interface (UI), a common software infrastructure, links National Data Centres and the scientific community and permits them to ingest, select and manipulate science data products.
The CSDS UI was developed by a team of ESA establishments and scientific institutes. To a large extent, it is based on existing software in order to minimise the risk and to ensure a rapid development cycle. It was developed over a period of two years. There have been five incremental deliveries, to allow early feedback to be obtained and to enable the development team to cope with requests for changes and new requirements in a controlled way. This method also contributed to the development being kept within the original 'cost at completion'.
Cluster, which together with SOHO constitutes the first cornerstone of ESA's Science Programme's Horizon 2000 long-term plan, is a mission consisting of four identical satellites. It will be launched on the first Ariane 5 flight, presently scheduled for Spring 1996. The four Cluster satellites will orbit in a tetrahedral formation, with the aim of building for the first time a 3-D picture of the physical processes occurring in the magnetosphere. The Cluster payload consists of 11 instruments on each satellite, each generating raw data to be acquired and processed on the ground. The products will then be distributed to the Cluster scientific community.
To do so, a network-based system called the Cluster Science Data System (CSDS) was established. The CSDS includes:
For more information on the CSDS, see 'Collection and Dissemination of Cluster Data' in ESA Bulletin No. 84, November 1995.
The CSDS User Interface (UI) offers the NDCs and the scientific community a number of specialised data products and services.
CSDS data products
The NDCs will make several types
of Cluster data available to the scientific community:
The files are stored in the Common Data Format (CDF), which is a standard for all the projects within the ISTP. With this format, it is possible for the scientists to manipulate the data.
Each NDC is responsible for the 'pipeline process' that generates the Prime Parameters (PP) and Summary Parameters (SP), i.e. the process used to convert the raw data to data products, for the NDC's own set of instruments. The PP and SP data files for the other instruments must be fetched from the other NDCs.
As mentioned above, each NDC will generate SP and PP CDF files for its own instruments. These data must then be validated by the instrument's PI before they are made available to the user community. The CSDS UI permits the PI to insert validation information into the CDF files. Once the CDF files have been validated, the NDC will perform two actions, both by means of the CSDS UI:
At this stage, the CDF files are available to the scientists served by the local NDC, but not to the scientists served by the other NDCs. Therefore, all NDCs will use the CSDS UI software every day to look up the other NDCs and to pick up the new PP/SP CDF files that have been made available. These files are then included in the local SP/PP database, and thus made available to the local scientific community. When the German NDC is looked up, the CSDS UI also picks up the new Summary Plot files.
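The daily pick-up can be pictured as a comparison between what a remote NDC offers and what the local database already holds. The sketch below is illustrative only: the function name, the file names and the representation of a file as a (name, version) pair with an integer version are assumptions, not part of the CSDS UI.

```python
def files_to_fetch(remote_listing, local_listing):
    """Return remote (name, version) pairs that are missing locally or newer."""
    local_versions = dict(local_listing)  # file name -> version held locally
    wanted = []
    for name, version in remote_listing:
        # A file unknown to the local database counts as version 0.
        if version > local_versions.get(name, 0):
            wanted.append((name, version))
    return sorted(wanted)


# Hypothetical listings: the remote NDC has a newer SP file and a new PP file.
remote = [("sp_fgm_19960601.cdf", 2), ("pp_efw_19960601.cdf", 1)]
local = [("sp_fgm_19960601.cdf", 1)]

new_files = files_to_fetch(remote, local)
print(new_files)
```

Only the entries returned would need to be transferred and then included in the local database.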
As far as the events catalogues are concerned, the CSDS UI will automatically update the copies stored in the NDCs with the newest version from the master catalogues in the JSOC.
The CSDS UI permits the NDCs to manage their own user community. They can register users and give access rights. The access rights can be given both in terms of data type (PP and SP data), as well as instrument and observation time interval for the PP data.
Lastly, the CSDS UI provides utilities to manage the databases in the NDC (remove data, backup data, etc.) and to log user access.
Once the data is in the NDC, the Cluster scientists served by the NDC can access it. The CSDS UI permits the scientist to:
In this section, the CSDS UI is presented on a technical level. A view of the environment in which the software will work is first presented, and then the different modules that make up the system are explained.
Figure 1 shows how the CSDS UI fits in with the pipeline process for generating science data in a local National Data Centre, and how the data flows through the system.
Figure 1. The CSDS User Interface's main components and its interfaces
Data generated by the four Cluster spacecraft undergoes several stages of transfer and processing before being made available to scientific users as PP and SP databases through the NDCs. Raw data is distributed, on a regular basis, to NDCs. Each NDC generates the PP and SP CDF files for the instruments related to that NDC. These non-validated PP and SP CDF files constitute the interface between the local NDC pipeline process and the CSDS UI.
The events catalogues are regularly updated in each NDC from the master catalogues in JSOC. This is an unattended process which does not require any regular interaction by the NDC personnel.
Three categories of users, with different roles, make use of the CSDS-UI:
The PI/CoIs perform the data validation.
NDC system managers perform CSDS-UI system management tasks in the NDC and provide support to both PI/CoIs and scientific users registered to the NDC. Their main tasks are:
Scientific users are provided with a set of client-server applications to:
The CSDS-UI application is based on a client-server architecture, available for both Sun Solaris and DEC Alpha OpenVMS operating systems. CSDS-UI communications rely on the TCP/IP protocol. The following types of carriers are supported:
This configuration is depicted in Figure 2.
Figure 2. Interconnection of NDCs and users
CSDS user interfaces are based on Motif (Graphical User Interface or GUI) or, for batch processing or low network bandwidth, on a character-based interface (Command Line Interface or CLI). Catalogues and related query functions, and user registration, were developed using Oracle products. Oracle is also the basis for the PP/SP catalogues and the events catalogues.
Data validation
The 'Data Validation' application
allows the PI/CoI to validate PP and SP CDF data files produced
by the pipeline process. Through a GUI (Fig. 3), it allows CDF file
validation attributes to be entered and modified, and the
incremental CDF file version numbering scheme to be managed. Data
validation is run locally, at each NDC.
Figure 3. The GUI for the 'Data Validation' application
NDC system manager applications
The NDC system
manager is responsible for the management of the data products,
catalogues and users' access rights, and for the NDC's activity
monitoring.
Data files and catalogue management
On a routine basis,
the NDC system manager fetches, from other NDCs, data files and
in particular the validated CDF files for the instruments not
related to the manager's own NDC. This function is provided by
the DC-to-DC file transfer application.
Once available, remotely or locally produced data files are ingested in the PP and SP database and are catalogued by extracting the relevant information from the data files. This process is shown in Figure 4.
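The cataloguing step can be sketched as follows. This is a minimal illustration, not the CSDS UI implementation: it uses Python's built-in sqlite3 module as a stand-in for the Oracle catalogues, and the table layout, column names and file metadata are all assumptions.

```python
import sqlite3


def ingest(conn, meta):
    """Add one catalogue row built from metadata extracted from a data file."""
    conn.execute(
        "INSERT INTO catalogue "
        "(name, instrument, data_type, version, start_time, end_time) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (meta["name"], meta["instrument"], meta["data_type"],
         meta["version"], meta["start_time"], meta["end_time"]),
    )


# In-memory database standing in for the NDC catalogue (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalogue (name TEXT PRIMARY KEY, instrument TEXT, "
    "data_type TEXT, version INTEGER, start_time TEXT, end_time TEXT)"
)

# Hypothetical metadata, as it might be read from two ingested files.
ingest(conn, {"name": "sp_fgm_19960601.cdf", "instrument": "FGM",
              "data_type": "SP", "version": 1,
              "start_time": "1996-06-01T00:00", "end_time": "1996-06-02T00:00"})
ingest(conn, {"name": "pp_fgm_19960601.cdf", "instrument": "FGM",
              "data_type": "PP", "version": 1,
              "start_time": "1996-06-01T00:00", "end_time": "1996-06-02T00:00"})

# A catalogue query by instrument and data type.
rows = conn.execute(
    "SELECT name FROM catalogue WHERE instrument = ? AND data_type = ?",
    ("FGM", "SP"),
).fetchall()
print(rows)
```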
Figure 4. The NDC system manager is responsible for the
management of the data products: DC-to-DC file transfer, data
ingestion and catalogue loading
In addition to populating the files database and updating the related catalogues, the NDC system manager is required to manage them. The 'Catalogue Management' application allows the NDC system manager to update the catalogues to reflect any manual modification made to the databases, such as the deletion or relocation of a data file. The Catalogue Management function is provided through a GUI built on the Oracle Forms product. The GUI is shown in Figure 5.
Figure 5. The GUI for 'Catalogue Management', used by an NDC system manager
Users and access rights management
Whereas PI/CoIs have full
access rights for all instrument data files and catalogues, other
users, by default, may access only SP data files and catalogues.
Whenever needed, for example during a scientific campaign, the
access rights of a non-PI user can be extended to the PP data by
granting the user one or more 'campaigns'. A campaign is defined
as a period of observation time and a list of instruments for
which access to the PP data files and related catalogues is
allowed.
In agreement with the PIs, the NDC system manager takes care of data access security by managing the campaign, user and access-rights databases. These functions are provided by the 'User and Access Rights' application, which is based on Oracle Forms.
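The access rule described in this section can be expressed compactly. The sketch below is an illustration under stated assumptions: the class and function names, the string codes "PP" and "SP", and the example campaign are hypothetical, and the real check is performed against the Oracle databases in the NDC.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Campaign:
    """A period of observation time plus the instruments it covers."""
    start: datetime
    end: datetime
    instruments: frozenset


def may_access(is_pi, campaigns, data_type, instrument, obs_time):
    """Apply the default rule: PI/CoIs see everything, others see SP only,
    unless a granted campaign covers the instrument and observation time."""
    if is_pi or data_type == "SP":
        return True
    return any(
        c.start <= obs_time <= c.end and instrument in c.instruments
        for c in campaigns
    )


# Hypothetical campaign: PP data of one instrument for June 1996.
june = Campaign(datetime(1996, 6, 1), datetime(1996, 6, 30), frozenset({"FGM"}))
when = datetime(1996, 6, 10)

print(may_access(False, [june], "PP", "FGM", when))  # covered by the campaign
print(may_access(False, [june], "PP", "EFW", when))  # instrument not covered
```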
NDC's monitoring of activities
All applications produce
log information either as flat files or into the Oracle database.
The NDC system manager analyses these logs to obtain valuable
information about the NDC server behaviour. In particular, the
system manager can assess how the users' activities affect
the NDC server and tune the configuration accordingly. Log
information is also useful for identifying and fixing users'
problems.
The CSDS-UI registered user is provided with a set of applications to access and manipulate data that is located at the local NDC or has already been retrieved to the user's own computer. All these applications have a GUI and most of them rely on a client-server architecture.
The user applications are:
They are reached via the 'Session Manager' window, which provides a menu of the applications available.
The catalogue browser application
The catalogue browser allows
the user to interactively search through the catalogues. Three
different catalogue types are available (Table 1). The catalogue
browser is based on the Oracle Forms product.
Catalogue type | Contents
PP/SP CDF data files | DF file information: name, date of the data measurement, file version, size, creation date, etc. Data interval information: start time and end time, instrument mode, percentage of bad data, duration of gaps, etc. (There is one catalogue per instrument and data type, i.e. 21 catalogues.)
Orbit geometry | Sampled data from the auxiliary CDF data files, which contain data related to the quality and shape of the tetrahedron formed by the four Cluster spacecraft
Event catalogues | Predicted solar cycle trends; predicted geometric positions; predicted scientific events; predicted scientific positions; scientific events
From the catalogue browser, the user can save and re-load a search definition and save, as an ASCII file, a search result. From the PP and SP catalogues only, it is possible to fetch, merge or subset CDF files (Fig. 6) and retrieve the result file to the user's computer (Fig. 7). Lastly, a query on PP or SP catalogues can be saved for re-use by the data manipulation application, ISDAT (see below); this constitutes the ISDAT interface to the CDF data files catalogues.
Figure 6. The GUI for the PP/SP query made via the catalogue browser
Figure 7. The GUI for the PP/SP result obtained via the catalogue browser
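The subset and merge operations offered from the PP and SP catalogues can be illustrated on time-stamped records. This is only a sketch: real CDF files carry many variables and metadata, whereas here each record is assumed to be a simple (time, value) pair with the time in seconds, and the function names are hypothetical.

```python
import heapq


def subset(records, start, end):
    """Keep the records whose time stamp lies in the interval [start, end)."""
    return [r for r in records if start <= r[0] < end]


def merge(*record_lists):
    """Combine several time-ordered record lists into one time-ordered list."""
    return list(heapq.merge(*record_lists, key=lambda r: r[0]))


# Two hypothetical files covering consecutive intervals.
file_a = [(0, 1.0), (60, 1.5), (120, 2.0)]
file_b = [(180, 2.5), (240, 3.0)]

merged = merge(file_a, file_b)
window = subset(merged, 60, 240)
print(window)
```

Merging first and subsetting afterwards lets a requested interval span file boundaries, which is why the two operations are shown together.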
The data manipulation application ISDAT
The Interactive
Science Data Analysis Tool, ISDAT, is the CSDS-UI scientific data
manipulation package. The system allows the scientists to use
different applications running on their own machines (the
clients) to select and plot data that reside on the NDC machine
(the server). An overview of the ISDAT architecture is shown in
Figure 8.
Figure 8. ISDAT client-server architecture
ISDAT allows the scientific user to select data from the PP and SP CDF files on the basis of the instrument, time interval or another parameter of interest. The selected data can then be displayed by means of a specialised client. The clients display data in different ways: for example, one produces graphical plots (the cuigr client) while another shows the metadata of the CDF files (the cuimeta client).
All of the CDF files are indexed as part of the 'Data Ingestion' application, so the system provides quick access to the data. In addition, the ISDAT server verifies the user data access rights by interfacing with the Oracle user database in the NDC.
Several ISDAT clients are available. They are initially invoked via the ISDAT Time Manager client. The user specifies the time interval to be viewed. All of the clients 'owned' by that Time Manager will then display data for that time interval (although the type of data displayed will be different), allowing for example one client (the cuigr) to plot an instrument parameter while another client (the cuimeta) shows the metadata for the same instrument over the same period.
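The relationship between a Time Manager and the clients it 'owns' resembles the familiar observer pattern. The sketch below illustrates that idea only and does not reproduce ISDAT's actual interfaces: the class names, method names and time strings are assumptions.

```python
class RecordingClient:
    """Stand-in for a display client; it records the intervals it is asked to show."""

    def __init__(self, name):
        self.name = name
        self.shown = []

    def show(self, start, end):
        # A real client would plot data or list metadata for this interval.
        self.shown.append((start, end))


class TimeManager:
    """Holds the current time interval and passes it on to every owned client."""

    def __init__(self):
        self.clients = []
        self.interval = None

    def add_client(self, client):
        self.clients.append(client)

    def set_interval(self, start, end):
        self.interval = (start, end)
        for client in self.clients:
            client.show(start, end)


manager = TimeManager()
plotter = RecordingClient("graphics")
metadata = RecordingClient("metadata")
manager.add_client(plotter)
manager.add_client(metadata)

# One change of interval reaches both clients.
manager.set_interval("1996-06-01T00:00", "1996-06-01T06:00")
print(plotter.shown, metadata.shown)
```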
A few examples of the most important ISDAT clients are:
Figure 9. The GUI for the ISDAT graphic client
The simple display
The 'Simple Display' application provides a
quick way for the users to view the CDF files that they have
retrieved from their NDC, but it does not provide any data
manipulation facilities like ISDAT. The GUI for this application
is shown in Figure 10.
Figure 10. The GUI for the simple display
The Summary Plot Browser
The 'Summary Plot Browser' provides
the user with a GUI to browse and retrieve, from the local NDC,
SP files.
History
Since the CSDS
Announcement of Opportunity in 1990, the Agency had been tasked with
providing a software environment that the Cluster scientific
users could use to access the data. At that time, the ESA-funded
European Space Information System (ESIS) was the preferred
environment. It was under development, with the Pilot Phase to
be approved at the end of 1993. It became clear at the beginning
of 1994, however, that the future of ESIS looked very uncertain (*),
and the Cluster Project decided to take urgent action to recover
the situation.
(*) A few months later, in June 1994, ESA's Science Programme Committee (SPC) decided to transfer ESIS to the scientific institutions. ESA no longer supports the development of ESIS.
A 'tiger team', composed of staff from ESA and the Data Centres, was set up to address the situation. A User Requirements document, containing the specifications for the Cluster community's requirements, was prepared with the joint contribution of all the affected parties: the end-users, ESA and the Data Centres. On the basis of that document, the tiger team reviewed the existing and planned software systems available within the Cluster community, and identified those most suitable for the provision of the required services.
The ESIS software, provided by ESRIN, and the ISDAT package, provided by the Swedish Institute of Space Physics (IRF-U), were identified as the best candidates to be the building blocks for a new CSDS User Interface (**). These recommendations were presented to, and accepted by, the CSDS Steering Committee and the Implementation Working Group. The project team was then finally set up and the cost at completion agreed.
(**) As the reader may have realised from the technical description, the term 'User Interface' was probably too limited to describe the overall software, which provides end-users and Data Centres with much more functionality than just an interface to the data.
Thanks to a goal-oriented management approach, good collaboration, and the joint effort of all the entities involved, the CSDS-UI project team was already in place before the end of May 1994 and was working towards Release 1 of the system.
The major project activities were:
The project team and the division of work
The CSDS-UI
project was carried out by ESA, with the support of
contractors Rutherford Appleton Laboratory (RAL, UK), Queen Mary
and Westfield College (UK), and IRF-U. With respect to ESA's
role, ESTEC performed the project management, while ESRIN was
responsible for the development of the following modules:
ESRIN was also responsible for the overall CSDS-UI integration, testing and deployment to the Data Centres, and will also provide user support and maintenance during the operation of the Cluster satellites.
RAL was responsible for the development, integration and testing of the following modules:
IRF-U was responsible for the adaptation, improvement and extension of the ISDAT package to meet the CSDS-UI requirements for the scientific analysis of SPs and PPs.
Queen Mary and Westfield College acted as the consultant for all the aspects pertaining to the CDF.
Budget
The overall project development has remained
within the original cost at completion of 1730 KAU. (This figure
includes only direct costs. It does not include about 200 KAU of
ESA staff costs.) The 270 KAU originally kept to cover
contingencies has been used to include new functionalities as
requested by the users.
Schedule
The project adopted an Incremental
Delivery Approach (Fig. 11). Three software deliveries were originally envisaged.
Formal acceptance of the software, however, was applied to the final
version only.
Figure 11. Schedule for the CSDS User Interface project
The first delivery (Release 1) took place in July 1994, just three months after the official start of the project. This release mainly addressed the system management functionalities made available to the Data Centres.
The second delivery (Release 2) was distributed to Data Centres in February 1995. It contained all the Data Centre functionalities and most of the functionalities for the end-user. The Data Centres evaluated this version, and the feedback was reported to the development team. Moreover, this version gave the Data Centres the opportunity to become acquainted with the various installation procedures.
Finally, in July 1995, after having passed a formal acceptance test procedure with the Cluster project office, the final version (Release 3) was made available to the Data Centres and to the end-users.
The Data Centres, in turn, were in charge of re-executing the acceptance procedures before formally accepting the software. This activity, carried out during the past summer, led to proposals for several new features and identified some critical performance limitations.
Based on these results, a new version (Release 4) was prepared and distributed in December 1995. The user community has introduced some more new requirements and ESA is now planning a fifth version for release in April 1996, in conjunction with the new launch date for the mission.
Maintenance of the UI
Throughout the life of the
Cluster mission, i.e. until mid-1998, ESA will provide regular
maintenance to the User Interface. Procedures to collect the
Software Problem Reports from end-users and Data Centres have
been put in place and will be implemented on a fixed-effort basis
(requiring about two person-years per year).
The CSDS-UI project is a practical demonstration of an efficient way to develop the infrastructure for distributing data from ESA scientific missions at a relatively low cost and within a strict schedule. A few basic principles, which may easily be adopted in other, similar projects, have been followed: