S254 Data Access

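The original raw data is in "lmd" format, while all other data files (rootified raw or higher data generations) are in "root" format. The S254 software provides two interface classes, TAGactMbsReader and TAGactTreeReader, to read them. In both cases data can be processed either from disk or directly from the stage pool of the mass storage system.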

Access from Disk

This is the best way to handle frequently used files in interactive work. In many (hopefully most) cases you'll find the files on our central data file system /d/kp3. If this is not the case, it will be necessary to create a private local copy.

Using /d/kp3/tsm

We use the /d/kp3/tsm file tree to hold these frequently used files. The directory structure simply mirrors the directory structure used in the mass storage system, just with a /d/kp3/tsm prefix. So the file with the tsm archive path
    /s254/ntu_mar03/ntu_traw_1026.root
will be stored under
    /d/kp3/tsm/s254/ntu_mar03/ntu_traw_1026.root
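The mapping is a plain prefix operation, so it is easy to script. The following is a minimal sketch in POSIX shell; the function name tsm_disk_path is a hypothetical helper chosen for illustration and is not part of the S254 software or of tsmcli.

```shell
# Hypothetical helper (not part of the S254 software or tsmcli):
# map a tsm archive path to its expected location under /d/kp3/tsm.
tsm_disk_path() {
    # $1: archive path starting with "/", e.g. /s254/ntu_mar03/ntu_traw_1026.root
    printf '%s\n' "/d/kp3/tsm$1"
}

# Example: build the disk location and check whether the file is really there.
f=$(tsm_disk_path /s254/ntu_mar03/ntu_traw_1026.root)
if [ -r "$f" ]; then
    echo "found on disk: $f"
else
    echo "not on /d/kp3: $f"
fi
```

The readability check matters because not every archived file is mirrored under /d/kp3/tsm.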

Using local copies

To create a local copy of an archived file use the tsmcli program with the "retrieve" command. Note that the tsmcli program by default puts the file in the current working directory, so first one has to 'cd' into the directory where the file is to be stored. To create for example a local copy of the archived file "/s254/gen/run_mar03/raw/run_0993.root" in /scratch.local/data enter
    cd /scratch.local/data
    tsmcli retrieve run_0993.root s254 gen/run_mar03/raw
Note the unusual command syntax: the file name comes first, followed by the archive name and then the path within the archive. For details check "man adsmcli". When processing more than a single file, make sure to stage the files before you retrieve them. Also consider whether direct access to the stage pool via RFIO is more efficient.
Note too: The $HOME file systems should never be used to store copies of mass storage data. Use /tmp or /scratch.local.
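Because the argument order differs from the archive file name, it can help to derive the arguments mechanically. The sketch below splits an archive file name of the form /<archive>/<path>/<file> into the three tsmcli arguments and only prints the resulting command; it does not run tsmcli. The function name tsm_retrieve_cmd is hypothetical, and the sketch assumes the file sits at least one directory below the archive name.

```shell
# Hypothetical helper: print (not run) the tsmcli retrieve command for an
# archive file name of the form /<archive>/<path>/<file>.
tsm_retrieve_cmd() {
    full=${1#/}            # drop the leading "/"
    archive=${full%%/*}    # first component, e.g. s254
    rest=${full#*/}        # everything after the archive name
    file=${rest##*/}       # last component, e.g. run_0993.root
    path=${rest%/*}        # what lies in between, e.g. gen/run_mar03/raw
    printf 'tsmcli retrieve %s %s %s\n' "$file" "$archive" "$path"
}

# Prints: tsmcli retrieve run_0993.root s254 gen/run_mar03/raw
tsm_retrieve_cmd /s254/gen/run_mar03/raw/run_0993.root
```

Printing the command first makes it easy to inspect it before running it from the target directory.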

Access from Stage Pool

The mass storage system has a large disk storage pool, called the 'stage pool', where recently used data files are cached. The size of this pool is currently more than one TByte and will increase soon. With the RFIO protocol it is possible to access those files directly. The two S254 framework classes TAGactMbsReader and TAGactTreeReader are RFIO-enabled and thus support direct access to both '.lmd' and '.root' files. Just use file names of the structure
    rfio:gsitsma:<archive_file_name>
and the RFIO protocol will be used to connect to the server gsitsma (the current tsm server) and access the file.
For example, to read the rootified raw data file of run 993 of the March'03 campaign, open
    rfio:gsitsma:/s254/gen/run_mar03/raw/run_0993.root
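Since the disk copy and the RFIO name are both derived from the same archive file name, a script can choose between them automatically. The following is a minimal sketch; the function name s254_file_name is hypothetical, and the preference for a readable copy under /d/kp3/tsm over RFIO is an assumption based on the recommendation for frequently used files.

```shell
# Hypothetical helper: given an archive file name, print the name to open.
# Prefers a readable copy under /d/kp3/tsm, else falls back to the RFIO name.
s254_file_name() {
    if [ -r "/d/kp3/tsm$1" ]; then
        printf '%s\n' "/d/kp3/tsm$1"
    else
        printf '%s\n' "rfio:gsitsma:$1"
    fi
}

# Example: resolve the rootified raw data file of run 993.
s254_file_name /s254/gen/run_mar03/raw/run_0993.root
```

The printed name can then be handed to whatever program opens the file through the RFIO-enabled reader classes.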

Staging Data

It is essential for the efficiency and long-term reliability of the mass storage system that the number of tape transactions is kept to a minimum. Each tape mount costs time (about 3 minutes) and also causes wear and tear on the volume and the drive.

Whenever more than a single file is to be accessed from the mass storage system, one should therefore first stage the whole block of files with the tsmcli "stage" command:

     tsmcli stage "<file_block>" <archive> <path>
Unfortunately tsmcli allows only simple "*" type wildcards, so in practice only blocks of 10 or 100 files can be staged efficiently. For example, to stage the rootified raw data files of the runs 990…999 of the March'03 campaign, give the command
     tsmcli stage "run_099*.root" s254 gen/run_mar03/raw
Note: The wildcarded file name has to be put in quotes to prevent shell expansion of meta characters.
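The block of ten runs that contains a given run can be computed from the run number. The sketch below only prints the stage command and does not run tsmcli. The function name tsm_stage_decade_cmd is hypothetical, and the sketch assumes file names of the form run_NNNN.root with a four-digit run number, as in the examples above. The archive and path default to the March'03 values and can be overridden.

```shell
# Hypothetical helper: print (not run) the stage command for the block of
# ten runs that contains run number $1 (decimal, without leading zeros).
# $2 and $3 optionally override the archive and the path.
tsm_stage_decade_cmd() {
    prefix=$(printf '%03d' $(( $1 / 10 )))   # first three digits of the run number
    printf 'tsmcli stage "run_%s*.root" %s %s\n' \
        "$prefix" "${2:-s254}" "${3:-gen/run_mar03/raw}"
}

# Prints: tsmcli stage "run_099*.root" s254 gen/run_mar03/raw
tsm_stage_decade_cmd 993
```

The run number must be given without leading zeros, since shell arithmetic would otherwise treat it as octal.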


Last modified: Wed Oct 8 15:10:43 CEST 2003
Walter F.J. Müller