S254 Data Access
The original raw data is in "lmd" format, while all other data files
(rootified raw or higher data generations) are in "root" format.
The S254 software provides two interface classes
- TAGactMbsReader for '.lmd' files
- TAGactTreeReader for '.root' files
In both cases data can either be processed
- from disk (local or NFS mounted)
- directly from stage pool (via RFIO)
Access from Disk
This is the best way to handle frequently used files in interactive work.
In many (hopefully most) cases you'll find the files on our central data
file system /d/kp3. If this is not the case, it will be
necessary to create a private local copy.
Using /d/kp3/tsm
We use the /d/kp3/tsm file tree to hold these frequently
used files. The directory structure simply mirrors the directory structure
used in the mass storage system, just with a /d/kp3/tsm prefix.
So the file with the tsm archive path
/s254/ntu_mar03/ntu_traw_1026.root
will be stored under
/d/kp3/tsm/s254/ntu_mar03/ntu_traw_1026.root
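The path mapping above can be sketched as a small shell helper. The function name kp3_path is hypothetical and not part of the S254 software; it only prepends the /d/kp3/tsm prefix, then checks whether the mirrored file is readable.

```shell
# Hypothetical helper (not part of the S254 software):
# map a tsm archive path to its mirror location under /d/kp3/tsm.
kp3_path() {
    printf '/d/kp3/tsm%s\n' "$1"
}

# Check whether the mirrored copy is actually available on disk.
f=$(kp3_path /s254/ntu_mar03/ntu_traw_1026.root)
if [ -r "$f" ]; then
    echo "found on disk: $f"
else
    echo "not on disk, a local copy or RFIO access is needed: $f"
fi
```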
Using local copies
To create a local copy of an archived file use the tsmcli
program with the "retrieve" command. Note that the tsmcli
program by default puts the file in the current working directory,
so first one has to 'cd' into the directory where the file is to
be stored. To create for example a local copy of the archived file
"/s254/gen/run_mar03/raw/run_0993.root" in
/scratch.local/data
enter
cd /scratch.local/data
tsmcli retrieve run_0993.root s254 gen/run_mar03/raw
Note the unusual argument order (file name, then archive, then path); for details check "man adsmcli".
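Because the argument order is easy to get wrong, the following sketch splits a full archive path into the three arguments used in the example above. The helper name retrieve_cmd is hypothetical. It assumes the file sits at least one directory below the archive name, and it only prints the command (a dry run) rather than calling tsmcli.

```shell
# Hypothetical helper: print the tsmcli retrieve command for a full archive path.
# Dry run only: the command is printed, nothing is retrieved.
retrieve_cmd() {
    full=${1#/}          # drop leading slash: s254/gen/run_mar03/raw/run_0993.root
    file=${full##*/}     # last component:     run_0993.root
    dir=${full%/*}       # everything before:  s254/gen/run_mar03/raw
    archive=${dir%%/*}   # first component:    s254
    path=${dir#*/}       # remainder:          gen/run_mar03/raw
    printf 'tsmcli retrieve %s %s %s\n' "$file" "$archive" "$path"
}

retrieve_cmd /s254/gen/run_mar03/raw/run_0993.root
```

Run the printed command from the target directory, since tsmcli writes into the current working directory.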
When processing more than a single file, make sure to
stage the files before you retrieve them.
Also consider whether direct access to the stage pool
via RFIO is more efficient.
Note too: The $HOME file systems should never be used to
store mass storage data copies. Use /tmp or /scratch.local.
Access from Stage Pool
The mass storage system has a large disk storage pool called 'stage
pool' where recently used data files are cached. The size of this
pool is currently more than one TByte and will increase soon.
With the RFIO protocol it is possible to access those files directly.
The two S254 framework classes TAGactMbsReader and
TAGactTreeReader are RFIO enabled and thus support direct
access of both '.lmd' and '.root' files. Just
use file names of the structure
rfio:gsitsma:<archive_file_name>
and the RFIO protocol will be used to connect to the server
gsitsma (the current tsm server) and access the file.
To read the rootified raw data file of run 993 of the March'03
campaign one has for example to open
rfio:gsitsma:/s254/gen/run_mar03/raw/run_0993.root
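The file name structure can be generated with a one-line helper. The name rfio_name is hypothetical; the server defaults to gsitsma as stated above and can be overridden with a second argument should the tsm server change.

```shell
# Hypothetical helper: build an RFIO file name from an archive path.
# The second argument (server name) is optional and defaults to gsitsma.
rfio_name() {
    printf 'rfio:%s:%s\n' "${2:-gsitsma}" "$1"
}

rfio_name /s254/gen/run_mar03/raw/run_0993.root
```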
It is essential for the efficiency and also long-term reliability
of the mass storage system that the number of tape transactions
is kept to a minimum. Each tape mount costs time (about 3 minutes)
and causes wear and tear on the volume and drive.
Whenever more than a single file is to be accessed from the mass
storage system, one should therefore stage a whole block
of files before using them. This is done with the tsmcli command
stage:
tsmcli stage "<file_block>" <archive> <path>
Unfortunately, tsmcli allows only simple "*" type wild cards,
so in practice only blocks of 10 or 100 files can be efficiently staged.
To stage for example the rootified raw data files of the runs 990…999
of the March'03 campaign one has to give the command
tsmcli stage "run_099*.root" s254 gen/run_mar03/raw
Note: The wildcarded file name has to be put in quotes to prevent
shell expansion of meta characters.
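As a sketch of how the block of ten belonging to a given run can be derived, the helper below prints the corresponding stage command. The name stage_block_cmd is hypothetical, and it assumes file names of the form run_NNNN.root with four-digit run numbers, as in the examples above. Pass the run number without leading zeros, since the shell would otherwise treat it as octal. The command is only printed, not executed.

```shell
# Hypothetical helper: print the tsmcli stage command for the block of ten
# runs that contains the given run. Assumes run_NNNN.root file names.
stage_block_cmd() {
    run=$1; archive=$2; path=$3
    block=$(( run / 10 ))    # 993 -> 99, i.e. runs 990 to 999
    printf 'tsmcli stage "run_%03d*.root" %s %s\n' "$block" "$archive" "$path"
}

stage_block_cmd 993 s254 gen/run_mar03/raw
```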
S254 Documentation
Last modified: Wed Oct 8 15:10:43 CEST 2003
Walter F.J. Müller