\documentclass[10pt,a4paper]{report}
\usepackage[T1]{fontenc}
\usepackage{graphicx}
\usepackage[top=2cm, bottom=2cm, left=2cm, right=2cm]{geometry}
\usepackage{multirow}
\usepackage[table]{xcolor}
\usepackage{parskip}
\usepackage{moreverb}
\usepackage{tikz}
% Note about the package tikz-uml: This package is not part of some latex
% distributions (like tex live), and we are using a recent copy of it. It can
% be downloaded from the author's web page:
% http://perso.ensta-paristech.fr/~kielbasi/tikzuml/
\usepackage{tikz-uml}
\usepackage{SIunits}
\usetikzlibrary{positioning}

\begin{document}

\title{The CERN Tape Archive}
\author{German Cancio, Eric Cano, Daniele Kruse and Steven Murray}
\maketitle

\chapter{Introduction}

The main objective of the CTA project is to develop a prototype tape archive
system that transfers files directly between remote disk storage systems and
tape drives. The concrete remote storage system of choice is EOS.

The data and storage services (DSS) group of the CERN IT department currently
provides a tape archive service. This service is implemented by the
hierarchical storage management (HSM) system named the CERN advanced storage
manager (CASTOR). This HSM has an internal disk-based storage area that acts
as a staging area for tape drives. Until now this staging area has been a
vital component of CASTOR. It has provided the necessary buffer between the
multi-stream, block-oriented disk drives of end users and the single-stream,
file-oriented tape drives of the central tape system.
Assuming the absence of a sophisticated disk-to-tape scheduling system, at any
single point in time a disk drive will be required to service multiple data
streams whereas a tape drive will only ever have to handle a single stream.
This means that a tape stream will be at least one order of magnitude faster
than a disk stream.

With the advent of disk storage solutions that stripe single files over
multiple disk servers, the need for a tape archive system to have an internal
disk-based staging area has disappeared. Having a file striped over multiple
disk servers means that all of these disk servers can be used in parallel to
transfer that file to a tape drive, hence using multiple disk-drive streams to
service a single tape stream.

The CTA project is a prototype for a very good reason. The DSS group needs to
investigate and learn what it means to provide a tape archive service that
does not have its own internal disk-based staging area. The project also needs
to keep its options open in order to give the DSS group the best opportunities
to identify the best ways forward for reducing application complexity, easing
code maintenance, reducing operation overheads and improving tape efficiency.

The CTA project currently has no constraints that go against collecting a
global view of all tape, drive and user request states. This means the CTA
project should be able to implement intuitive and effective tape scheduling
policies. For example it should be possible to schedule a tape archive mount
at the point in time when there is both a free drive and a free tape. The
architecture of the CASTOR system does not facilitate such simple solutions
due to its history of having separate staging areas per experiment and
dividing the mount scheduling problem between these separate staging areas and
the central tape system responsible for issuing tape mount requests for all
experiments.
\chapter{CTA basic concepts}

CTA is operated by authorized administrators (AdminUsers) who issue CTA
commands from authorized machines (AdminHosts), using the CTA command line
interface. All administrative metadata (such as tapes, tape pools, storage
classes, etc.) is tagged with a "creationLog" and a "lastModificationLog"
which record who created or last modified the metadata, and when and where.
An administrator may create ("add"), delete ("rm"), change ("ch") or list
("ls") any of the administrative metadata.
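As a sketch of the tagging described above (the class and field names below are our own illustration, not the actual CTA schema):

```python
from dataclasses import dataclass

@dataclass
class EntryLog:
    """Who performed an action, from where, and when."""
    username: str  # the AdminUser that issued the command
    hostname: str  # the AdminHost the command was issued from
    time: int      # Unix timestamp of the action

@dataclass
class TapePool:
    """Every piece of administrative metadata carries both logs."""
    name: str
    creation_log: EntryLog
    last_modification_log: EntryLog

# On "add" both logs start out identical; a later "ch" would only update
# last_modification_log.
created = EntryLog("opadmin", "adminhost.cern.ch", 1434628800)
pool = TapePool("atlas_raw", created, created)
```

The same pair of logs would hang off every administrative object (tapes, storage classes, archive routes, and so on).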
Tape pools are logical groupings of tapes that are used by operators to
separate data belonging to different VOs. They are also used to categorize
types of data and to separate multiple copies of files so that they end up in
different buildings. Each tape belongs to one and only one tape pool.

Logical libraries are the concept used to link tapes and drives together. We
use logical libraries to specify which tapes are mountable into which drives.
Normally this mountability criterion is based on location, that is the tape
has to be in the same physical library as the drive, and on read/write
compatibility. Each tape and each drive has one and only one logical library.

The storage class is what we assign to each archive file to specify how many
tape copies the file is expected to have. Archive routes link storage classes
to tape pools. An archive route specifies onto which set of tapes the copies
will be written. There is an archive route for each copy in each storage
class, and normally there should be a single archive route per tape pool.

To summarize: an archive file has a storage class that determines how many
copies on tape that file should have. A storage class has an archive route per
tape copy to specify into which tape pool each copy goes. Each tape pool is
made of a disjoint set of tapes, and tapes can be mounted on drives that are
in the same logical library.

\section{Archiving a file with CTA}

CTA has a CLI for archiving and retrieving files to/from tape that is meant to
be used by an external disk-based storage system with an archiving workflow
engine such as EOS. A non-administrative "User" in CTA is an EOS user who
triggers the need for archiving or retrieving a file to/from tape. A user
normally belongs to a specific CTA "mount group", which specifies the mount
criteria and limitations (together called "mount policy") that trigger a tape
mount.
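To make the routing between these concepts concrete, here is a minimal sketch (our own illustrative names and pool identifiers, not CTA code) of how a storage class fans out into one archive job per tape copy via its archive routes:

```python
# One archive route per copy number in each storage class:
# (storage_class, copy_number) -> tape_pool
archive_routes = {
    ("lhc_raw_2copy", 1): "pool_building_513",
    ("lhc_raw_2copy", 2): "pool_building_613",
    ("user_1copy", 1): "pool_building_513",
}

def tape_pools_for(storage_class):
    """Return the tape pool of each copy of a file with this storage class,
    ordered by copy number."""
    return [pool for (sc, copy), pool in sorted(archive_routes.items())
            if sc == storage_class]

# A file with a two-copy storage class gets one archive job per route, each
# queued to a different tape pool (here: pools in different buildings).
assert tape_pools_for("lhc_raw_2copy") == ["pool_building_513",
                                           "pool_building_613"]
```

Each queued job then waits for a drive whose logical library contains a free tape of the target pool, as described in the archive process below.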
Here we offer a simplified description of the archive process:
\begin{enumerate}
\item EOS issues an archive command for a specific file, providing its source
path, its storage class (see above), and the user requesting the archival.
\item CTA immediately returns an "ArchiveFileID", which CTA uses to uniquely
identify files archived on tape. This ID will be kept by EOS for any
operations on this file (such as retrieval).
\item Asynchronously, CTA carries out the archival of the file to tape, in the
following steps:
\begin{itemize}
\item CTA looks up the storage class provided by EOS and makes sure it has
correct routings to one or more tape pools (more than one when multiple copies
are required by the storage class).
\item CTA queues the corresponding archive job(s) to the proper tape pool(s).
\item In the meantime each free tape drive queries the central "scheduler" for
work to be done, communicating its name and its logical library.
\item For each work request CTA checks whether there is a free tape in the
required tape pool that belongs to the drive's logical library.
\item If that is the case, CTA checks whether the work queued for that tape
pool is worth a mount, i.e.\ whether it meets the archive criteria specified
in the mount group to which the requesting user belongs.
\item If that is the case, the tape is mounted in the drive and the file gets
written from the specified source path to the tape.
\item After a successful archival CTA notifies EOS through an asynchronous
callback.
\end{itemize}
\end{enumerate}

An archival process can be canceled at any moment through the "delete archive"
command (even after a successful archival, in which case it is a deletion).

\section{Retrieving a file with CTA}

Here we offer a simplified description of the retrieve process:
\begin{enumerate}
\item EOS issues a retrieve command for a specific file, providing its
ArchiveFileID, the desired destination path, and the user requesting the
retrieval.
\item CTA returns immediately.
\item Asynchronously, CTA carries out the retrieval of the file from tape, in
the following steps:
\begin{itemize}
\item CTA queues the corresponding retrieve job(s) to the proper tape(s),
depending on where the tape copies are located.
\item In the meantime each free tape drive queries the central "scheduler" for
work to be done, communicating its name and its logical library.
\item For each work request CTA checks whether the drive's logical library is
the same as that of (one of) the queued tape(s).
\item If that is the case, CTA checks whether the work queued for that tape is
worth the mount, i.e.\ whether it meets the retrieve criteria specified in the
mount group to which the requesting user belongs.
\item If that is the case, the tape is mounted in the drive and the file gets
read from tape to the specified destination.
\item After a successful retrieval CTA notifies EOS through an asynchronous
callback.
\end{itemize}
\end{enumerate}

A retrieval process can be canceled at any moment prior to a successful
retrieval through the "cancel retrieve" command.

\chapter{EOS-CTA Authorization Guidelines}

One of the requirements of CTA is to limit the crosstalk among different EOS
instances.
In more detail:
\begin{enumerate}
\item A listStorageClass command should return only the list of storage
classes belonging to the instance from where the command was executed.
\item A queueArchive command should be authorized only if:
\begin{itemize}
\item the instance provided in the command line coincides with the instance
from where the command was executed
\item the storage class provided in the command line belongs to the instance
from where the command was executed
\item the EOS username and/or group (of the original archive requester)
provided in the command line belongs to the instance from where the command
was executed
\end{itemize}
\item A queueRetrieve command should be authorized only if:
\begin{itemize}
\item the instance of the requested file coincides with the instance from
where the command was executed
\item the EOS username and/or group (of the original retrieve requester)
provided in the command line belongs to the instance from where the command
was executed
\end{itemize}
\item A deleteArchive command should be authorized only if:
\begin{itemize}
\item the instance of the file to be deleted coincides with the instance from
where the command was executed
\item the EOS username and/or group (of the original delete requester)
provided in the command line belongs to the instance from where the command
was executed
\end{itemize}
\item A cancelRetrieve command should be authorized only if:
\begin{itemize}
\item the instance of the file to be canceled coincides with the instance from
where the command was executed
\item the EOS username and/or group (of the original cancel requester)
provided in the command line belongs to the instance from where the command
was executed
\end{itemize}
\item An updateFileStorageClass command should be authorized only if:
\begin{itemize}
\item the instance of the file to be updated coincides with the instance from
where the command was executed
\item the storage class provided in the command line belongs to the instance
from where the command was executed
\item the EOS username and/or group (of the original update requester)
provided in the command line belongs to the instance from where the command
was executed
\end{itemize}
\item An updateFileInfo command should be authorized only if:
\begin{itemize}
\item the instance of the file to be updated coincides with the instance from
where the command was executed
\end{itemize}
\end{enumerate}

\chapter{CTA-EOS Reconciliation Strategy}

\section{Reconciling EOS file info and CTA disk file info}

This should be the most common scenario causing discrepancies between the EOS
namespace and the disk file info within the CTA catalogue. The proposal is to
attack this in two ways: first (already done), we piggyback disk file info on
most commands acting on CTA archive files ("archive", "retrieve",
"cancelretrieve", etc.); second (to be agreed with Andreas), EOS could have a
trigger on file renames or other file information changes (owner, group, path,
etc.) that calls our updatefileinfo command with the updated fields.

In addition (also to be agreed with Andreas), there should be a separate
low-priority process (a sort of EOS-side reconciliation process) going through
the entire EOS namespace, periodically calling updatefileinfo on each of the
known files. We would also store the date when this update function was called
(see below to know why).

\section{Reconciling EOS deletes which haven't been propagated to CTA}

Say that the above EOS-side low-priority reconciliation process takes on
average 3 months and is run continuously. We could use the last reconciliation
date to determine the list of candidate files which EOS may no longer know
about, by just taking the ones which haven't been updated in, say, the last 6
months.
Since we have the EOS instance name and EOS file id for each file (and Andreas
confirmed that IDs are unique and never reused within a single instance), we
can then automatically check (through our own CTA-side reconciliation process)
whether these files indeed still exist. For the ones that still exist we
notify the EOS admins of a possible bug in their reconciliation process and
ask them to issue the updatefileinfo command; for the ones which don't exist
anymore we double-check with their owners before deleting them from CTA.

Note: it is important to note that we do not reconcile storage class
information. Any storage class change is triggered by the EOS user and it is
synchronous: once we successfully record the change our command returns.

\chapter{CTA-EOS command line interface}

EOS communicates with CTA by issuing commands on trusted hosts. EOS can
archive a file, retrieve it, update its information or storage class, delete
it, or simply list the available storage classes. See the EOS-CTA
Authorization Guidelines chapter for more details on how these commands are
authorized by CTA.
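All of the commands below take an \texttt{--encoded} flag: when it is true, every following argument value is base64 encoded, and mixing encoded and clear arguments is not allowed. As a minimal sketch of how a caller might prepare its argument values under that all-or-nothing rule (the helper name is our own, assuming standard base64):

```python
import base64

def encode_args(values):
    """Base64-encode every argument value, as required when --encoded is true.
    The rule from the CLI specification is all-or-nothing: either every value
    is encoded or none is."""
    return [base64.b64encode(v.encode()).decode() for v in values]

clear = ["eosdev", "root://eosdev.cern.ch:1094//eos/dev/file1"]
encoded = encode_args(clear)

# Round trip: the receiving side can recover the clear values.
assert [base64.b64decode(e).decode() for e in encoded] == clear
```

Encoding everything uniformly avoids ambiguity when argument values themselves contain spaces, quotes or other shell-sensitive characters.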
\section{ARCHIVING from EOS to CTA}
\begin{verbatim}
1) EOS REQUEST:

cta a/archive
  --encoded <"true" or "false"> // true if all following arguments are base64
                                // encoded, false if all following arguments
                                // are in clear (no mixing of encoded and
                                // clear arguments)
  --user          // string name of the requester of the action (archival),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --group         // string group of the requester of the action (archival),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --diskid        // string disk id of the file to be archived,
                  // kept by CTA for reconciliation purposes
  --instance      // string kept by CTA for authorizing the request
                  // and for disaster recovery
  --srcurl        // string source URL of the file to archive of
                  // the form scheme://host:port/opaque_part,
                  // not kept by CTA after successful archival
  --size          // uint64_t size in bytes kept by CTA for
                  // correct archival and disaster recovery
  --checksumtype  // string checksum type (ex. ADLER32) kept by CTA
                  // for correct archival and disaster recovery
  --checksumvalue // string checksum value kept by CTA for correct
                  // archival and disaster recovery
  --storageclass  // string that determines how many copies and
                  // which tape pools will be used for archival,
                  // kept by CTA for routing and authorization
  --diskfilepath  // string the disk logical path kept by CTA
                  // for disaster recovery and for logging
  --diskfileowner // string owner username kept by CTA
                  // for disaster recovery and for logging
  --diskfilegroup // string owner group kept by CTA
                  // for disaster recovery and for logging
  --recoveryblob  // 2KB string kept by CTA for disaster recovery
                  // (opaque string controlled by EOS)
  --diskpool      // string used (and possibly kept)
                  // by CTA for proper drive allocation
  --throughput    // uint64_t (in bytes) used (and possibly kept)
                  // by CTA for proper drive allocation

2) CTA IMMEDIATE REPLY: CTA_ArchiveFileID or Error

CTA_ArchiveFileID: string which is the unique ID of the CTA file, to be kept
by EOS while the file exists (for future retrievals). In case of retries, a
new ID will be given by CTA (as if it was a new file); the old one can be
discarded by EOS.
3) CTA CALLBACK WHEN ARCHIVED SUCCESSFULLY: src_URL and copy_number,
   with or without Error

src_URL: this is the same string provided in the EOS archival request
copy_number: indicates which copy number was archived
note: if multiple copies are archived there will be one callback per copy
\end{verbatim}

\section{RETRIEVING from CTA to EOS}
\begin{verbatim}
1) EOS REQUEST:

cta r/retrieve
  --encoded <"true" or "false"> // true if all following arguments are base64
                                // encoded, false if all following arguments
                                // are in clear (no mixing of encoded and
                                // clear arguments)
  --user          // string name of the requester of the action (retrieval),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --group         // string group of the requester of the action (retrieval),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --id            // uint64_t which is the unique ID of the CTA file
  --dsturl        // string of the form scheme://host:port/opaque_part,
                  // not kept by CTA after successful operation
  --diskfilepath  // string the disk logical path kept by CTA
                  // for disaster recovery and for logging
  --diskfileowner // string owner username kept by CTA for
                  // disaster recovery and for logging
  --diskfilegroup // string owner group kept by CTA for disaster
                  // recovery and for logging
  --recoveryblob  // 2KB string kept by CTA for disaster recovery
                  // (opaque string controlled by EOS)
  --diskpool      // string used (and possibly kept) by CTA for
                  // proper drive allocation
  --throughput    // uint64_t (in bytes) used (and possibly kept)
                  // by CTA for proper drive allocation

Note: disk info is piggybacked

2) CTA IMMEDIATE REPLY: Empty or Error

3) CTA CALLBACK WHEN RETRIEVED SUCCESSFULLY: dst_URL with or without Error

dst_URL: this is the same string provided in the EOS retrieval request
\end{verbatim}

\section{DELETING an ARCHIVE FILE}
\begin{verbatim}
1) EOS REQUEST:

cta da/deletearchive
  --encoded <"true" or "false"> // true if all following arguments are base64
                                // encoded, false if all following arguments
                                // are in clear (no mixing of encoded and
                                // clear arguments)
  --user          // string name of the requester of the action (deletion),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --group         // string group of the requester of the action (deletion),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --id            // uint64_t which is the unique ID of the CTA file

Note: This command may be issued even before the actual archival process
has begun

2) CTA IMMEDIATE REPLY: Empty or Error
\end{verbatim}

\section{CANCELING a SCHEDULED RETRIEVAL}
\begin{verbatim}
1) EOS REQUEST:

cta cr/cancelretrieve
  --encoded <"true" or "false"> // true if all following arguments are base64
                                // encoded, false if all following arguments
                                // are in clear (no mixing of encoded and
                                // clear arguments)
  --user          // string name of the requester of the action (cancel),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --group         // string group of the requester of the action (cancel),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --id            // uint64_t which is the unique ID of the CTA file
  --dsturl        // this is the same string provided in the EOS
                  // retrieval request
  --diskfilepath  // string the disk logical path kept by CTA for
                  // disaster recovery and for logging
  --diskfileowner // string owner username kept by CTA for disaster
                  // recovery and for logging
  --diskfilegroup // string owner group kept by CTA for disaster
                  // recovery and for logging
  --recoveryblob  // 2KB string kept by CTA for disaster recovery
                  // (opaque string controlled by EOS)

Note: This command will succeed ONLY before the actual retrieval process
has begun

Note: disk info is piggybacked

2) CTA IMMEDIATE REPLY: Empty or Error
\end{verbatim}

\section{UPDATE the STORAGE CLASS of a FILE}
\begin{verbatim}
1) EOS REQUEST:

cta ufsc/updatefilestorageclass
  --encoded <"true" or "false"> // true if all following arguments are base64
                                // encoded, false if all following arguments
                                // are in clear (no mixing of encoded and
                                // clear arguments)
  --user          // string name of the requester of the action (update),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --group         // string group of the requester of the action (update),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --id            // uint64_t which is the unique ID of the CTA file
  --storageclass  // updated storage class which may or may not have
                  // a different routing
  --diskfilepath  // string the disk logical path kept by CTA for
                  // disaster recovery and for logging
  --diskfileowner // string owner username kept by CTA for disaster
                  // recovery and for logging
  --diskfilegroup // string owner group kept by CTA for disaster
                  // recovery and for logging
  --recoveryblob  // 2KB string kept by CTA for disaster recovery
                  // (opaque string controlled by EOS)

Note: This command DOES NOT change the number of tape copies! The number
will change asynchronously (next repack or "reconciliation").
Note: disk info is piggybacked

2) CTA IMMEDIATE REPLY: Empty or Error
\end{verbatim}

\section{UPDATE INFO of a FILE}
\begin{verbatim}
1) EOS REQUEST:

cta ufi/updatefileinfo
  --encoded <"true" or "false"> // true if all following arguments are base64
                                // encoded, false if all following arguments
                                // are in clear (no mixing of encoded and
                                // clear arguments)
  --id            // uint64_t which is the unique ID of the CTA file
  --diskfilepath  // string the disk logical path kept by CTA for
                  // disaster recovery and for logging
  --diskfileowner // string owner username kept by CTA for disaster
                  // recovery and for logging
  --diskfilegroup // string owner group kept by CTA for disaster
                  // recovery and for logging
  --recoveryblob  // 2KB string kept by CTA for disaster recovery
                  // (opaque string controlled by EOS)

Note: This command is not executed on behalf of an EOS user. Instead it is
part of a resynchronization process initiated by EOS.
2) CTA IMMEDIATE REPLY: Empty or Error
\end{verbatim}

\section{LISTING all STORAGE CLASSES available}
\begin{verbatim}
1) EOS REQUEST:

cta lsc/liststorageclass
  --encoded <"true" or "false"> // true if all following arguments are base64
                                // encoded, false if all following arguments
                                // are in clear (no mixing of encoded and
                                // clear arguments)
  --user          // string name of the requester of the action (listing),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation
  --group         // string group of the requester of the action (listing),
                  // used for SLAs and logging,
                  // not kept by CTA after successful operation

2) CTA IMMEDIATE REPLY: storage class list
\end{verbatim}

\chapter{Getting the prototype up and running}

This chapter explains how to install the CTA prototype together with a local
EOS instance on a single local development box.

\section{Install a local EOS instance}

The CTA project requires xroot version 4 or higher. EOS depends on xroot and
therefore the EOS version used must also be compatible with xroot version 4 or
higher. An example combination of EOS and xroot versions compatible with the
CTA project is EOS version 4.0.4 Citrine together with xroot version 4.2.3-1.

\subsection{Configure yum to be able to find the correct EOS and xroot rpms}

For the EOS rpms create the \texttt{/etc/yum.repos.d/eos.repo} file with the
following contents.
\begin{verbatim}
[eos-citrine]
name=EOS 4.0 Version
baseurl=http://dss-ci-repo.web.cern.ch/dss-ci-repo/eos/citrine/tag/el-6/x86_64/
gpgcheck=0
enabled=1
\end{verbatim}

For the xroot rpms create the \texttt{/etc/yum.repos.d/epel.repo} file with
the following contents.

\begin{verbatim}
[epel]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN
baseurl=http://linuxsoft.cern.ch/epel/6/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=1
protect=0
\end{verbatim}

\begin{verbatim}
[epel-debug]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - debug RPMs
baseurl=http://linuxsoft.cern.ch/epel/6/$basearch/debug/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0
\end{verbatim}

\begin{verbatim}
[epel-source]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - source RPMs
baseurl=http://linuxsoft.cern.ch/epel/6/SRPMS/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0
\end{verbatim}

\begin{verbatim}
[epel-testing]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - testing
baseurl=http://linuxsoft.cern.ch/epel/testing/6/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0
\end{verbatim}

\begin{verbatim}
[epel-testing-debug]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - testing debug RPMs
baseurl=http://linuxsoft.cern.ch/epel/testing/6/$basearch/debug/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0
\end{verbatim}

\begin{verbatim}
[epel-testing-source]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - testing source RPMs
baseurl=http://linuxsoft.cern.ch/epel/testing/6/SRPMS/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0
\end{verbatim}

\subsection{Install the EOS and \texttt{xrootd} rpms}

Install the rpms using yum.
\begin{verbatim}
sudo yum install eos-client eos-server xrootd-client xrootd-debuginfo xrootd-server
\end{verbatim}

Here is an example list of successfully installed EOS and \texttt{xrootd}
rpms.
\begin{verbatim}
rpm -qa | egrep 'eos|xrootd' | sort
eos-client-4.0.4-citrine.slc6.x86_64
eos-server-4.0.4-citrine.slc6.x86_64
xrootd-4.2.3-1.el6.x86_64
xrootd-client-4.2.3-1.el6.x86_64
xrootd-client-libs-4.2.3-1.el6.x86_64
xrootd-debuginfo-4.1.1-1.slc6.x86_64
xrootd-libs-4.2.3-1.el6.x86_64
xrootd-python-4.2.3-1.el6.x86_64
xrootd-selinux-4.2.3-1.el6.noarch
xrootd-server-4.2.3-1.el6.x86_64
xrootd-server-libs-4.2.3-1.el6.x86_64
\end{verbatim}

\subsection{Setup the EOS \texttt{sysconfig} file}

Create the \texttt{/etc/sysconfig/eos} file based on the example installed by
the \texttt{eos-server} rpm:
\begin{verbatim}
sudo cp /etc/sysconfig/eos.example /etc/sysconfig/eos
\end{verbatim}

Reduce the \texttt{xrootd} daemon roles to the bare minimum of just
\texttt{mq}, \texttt{mgm} and \texttt{fst}. This means there will be a total
of three \texttt{xrootd} daemons running for EOS on the local development box.
\begin{verbatim}
XRD_ROLES="mq mgm fst"
\end{verbatim}

Set the name of the EOS instance, for example:
\begin{verbatim}
export EOS_INSTANCE_NAME=eoscta
\end{verbatim}

Replace all of the hostnames with the fully qualified hostname of the local
development box. The resulting hostname entries should look something like the
following, where \texttt{devbox.cern.ch} should be replaced with the fully
qualified name of the development box where EOS is being installed.
\begin{verbatim}
export EOS_INSTANCE_NAME=eoscta
export EOS_BROKER_URL=root://devbox.cern.ch:1097//eos/
export EOS_MGM_MASTER1=devbox.cern.ch
export EOS_MGM_MASTER2=devbox.cern.ch
export EOS_MGM_ALIAS=devbox.cern.ch
export EOS_FUSE_MGM_ALIAS=devbox.cern.ch
export EOS_FED_MANAGER=devbox.cern.ch:1094
export EOS_TEST_REDIRECTOR=devbox.cern.ch
# export EOS_VST_BROKER_URL=root://devbox.cern.ch:1099//eos/
# export EOS_VST_TRUSTED_HOST=devbox.cern.ch
\end{verbatim}

\subsection{Create a simple shared secret \texttt{keytab} file}

In order to internally authenticate the \texttt{mgm} and \texttt{fst} nodes
using the simple shared secret mechanism, create a simple shared secret
\texttt{keytab} file.
\begin{verbatim}
xrdsssadmin -k eoscta -u daemon -g daemon add /etc/eos.keytab
\end{verbatim}

\subsection{Create a kerberos \texttt{keytab} file readable by the EOS
\texttt{xrootd} daemons}

Create a system \texttt{/etc/krb5.keytab} file if one does not already exist;
for example, install the \texttt{cern-get-keytab} rpm if the development box
is at CERN and runs a CERN supported version of linux.
\begin{verbatim}
yum install cern-get-keytab
\end{verbatim}

In order for the EOS \texttt{mgm} to authenticate users using kerberos, create
a new \texttt{eos} service principal in the \texttt{kdc} and get the key
installed in the keytab. This will also recreate a new version of every other
key for this host. The key of the eos principal can then be extracted to a new
keytab, which is made owned by user daemon so that it becomes readable by the
\texttt{mgm}.
\begin{verbatim}
[root@devbox ~]# cern-get-keytab --service eostest -f
Waiting for password replication (0 seconds past)
Waiting for password replication (5 seconds past)
Waiting for password replication (10 seconds past)
Keytab file saved: /etc/krb5.keytab
[root@lxc2dev3d1 ~]# ktutil
ktutil: rkt /etc/krb5.keytab
ktutil: l
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1   14 devbox$@CERN.CH
   2   14 devbox$@CERN.CH
   3   14 devbox$@CERN.CH
   4   14 eos/devbox.cern.ch@CERN.CH
   5   14 eos/devbox.cern.ch@CERN.CH
   6   14 eos/devbox.cern.ch@CERN.CH
ktutil: delent 1
ktutil: delent 1
ktutil: delent 1
ktutil: l
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1   14 eos/devbox.cern.ch@CERN.CH
   2   14 eos/devbox.cern.ch@CERN.CH
   3   14 eos/devbox.cern.ch@CERN.CH
ktutil: wkt /etc/krb5.keytab.eos
ktutil: q
[root@devbox ~]# chown daemon.daemon /etc/krb5.keytab.eos
\end{verbatim}

This operation will re-generate all the keys of the host. It might require the
client users to \texttt{kdestroy} their corresponding tickets in caches.

\subsection{Setup the \texttt{/etc/xrd.cf.mgm} configuration file}

Backup the original \texttt{/etc/xrd.cf.mgm} file installed by the
\texttt{eos-server} rpm.
\begin{verbatim}
sudo cp /etc/xrd.cf.mgm /etc/xrd.cf.mgm_ORIGINAL
\end{verbatim}

Disable the unix-based authentication mechanism of xroot.
\begin{verbatim}
sudo sed -i 's/^sec.protocol unix.*/# &/' /etc/xrd.cf.mgm
\end{verbatim}

Disable the gsi-based authentication mechanism of xroot.
\begin{verbatim}
sudo sed -i 's/^sec.protocol gsi.*/# &/' /etc/xrd.cf.mgm
\end{verbatim}

Configure the kerberos authentication mechanism of xroot to read the EOS
specific kerberos \texttt{keytab} file.
\begin{verbatim}
sudo sed -i \
  's/^sec.protocol krb5.*/sec.protocol krb5 \/etc\/krb5.keytab.eos eos\/@CERN.CH/' \
  /etc/xrd.cf.mgm
\end{verbatim}

Set the order of authentication mechanisms to be used to kerberos followed by
simple shared secret.
\begin{verbatim}
sudo sed -i 's/^sec.protbind.*/# &/' /etc/xrd.cf.mgm
sudo sed -i 's/^# sec.protbind \*.*/sec.protbind * only krb5 sss/' /etc/xrd.cf.mgm
\end{verbatim}

The protocol configuration lines in the newly created \texttt{xrd.cf.mgm} file
should look something like the following.
\begin{verbatim}
#sec.protocol unix
sec.protocol krb5 /etc/krb5.keytab.eos eos/@CERN.CH
#sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem
-key:/etc/grid-security/daemon/hostkey.pem -gridmap:/etc/grid-security/gri
sec.protbind * only krb5 sss
mgmofs.broker root://devbox.cern.ch:1097//eos/
\end{verbatim}

Make sure the following entry exists so that the EOS namespace plugin will be
loaded.
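The combined effect of the \texttt{sed} substitutions above can be rehearsed on
a scratch file before touching the real configuration. The sample lines below
are assumptions standing in for a stock \texttt{xrd.cf.mgm}; the commands are
the same as above, minus \texttt{sudo}.

```shell
# Dry run of the xrd.cf.mgm sed edits on a scratch file with made-up sample lines.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
sec.protocol unix
sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem
sec.protocol krb5 /etc/krb5.keytab host/@CERN.CH
sec.protbind * only unix
EOF
sed -i 's/^sec.protocol unix.*/# &/' "$cfg"   # comment out unix authentication
sed -i 's/^sec.protocol gsi.*/# &/' "$cfg"    # comment out gsi authentication
sed -i 's/^sec.protocol krb5.*/sec.protocol krb5 \/etc\/krb5.keytab.eos eos\/@CERN.CH/' "$cfg"
sed -i 's/^sec.protbind.*/# &/' "$cfg"        # comment out the old binding
sed -i 's/^# sec.protbind \*.*/sec.protbind * only krb5 sss/' "$cfg"
cat "$cfg"
# Expected result:
#   # sec.protocol unix
#   # sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem
#   sec.protocol krb5 /etc/krb5.keytab.eos eos/@CERN.CH
#   sec.protbind * only krb5 sss
```

If the dry run produces the expected lines, the same \texttt{sed} commands can
be applied to the real \texttt{/etc/xrd.cf.mgm} with \texttt{sudo}. The
namespace plugin entry that must also be present in the file follows.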
\begin{verbatim}
#-------------------------------------------------------------------------------
# Set the namespace plugin implementation
#-------------------------------------------------------------------------------
mgmofs.nslib /usr/lib64/libEosNsInMemory.so
\end{verbatim}

\subsection{Setup the \texttt{/etc/xrd.cf.fst} configuration file}

Backup the original \texttt{/etc/xrd.cf.fst} file installed by the
\texttt{eos-server} rpm.
\begin{verbatim}
sudo cp /etc/xrd.cf.fst /etc/xrd.cf.fst_ORIGINAL
\end{verbatim}

Replace all of the hostnames with the fully qualified hostname of the local
development box. The hostname entries in the newly created \texttt{xrd.cf.fst}
file should look something like the following, where \texttt{devbox.cern.ch}
should be replaced with the fully qualified name of the development box where
EOS is being installed.
\begin{verbatim}
all.manager devbox.cern.ch 2131
fstofs.broker root://devbox.cern.ch:1097//eos/
\end{verbatim}

\subsection{Set both the EOS \texttt{mgm} and the EOS \texttt{mq} to be masters}
\begin{verbatim}
sudo service eos master mgm
sudo service eos master mq
\end{verbatim}

\subsection{Create a local directory to be used to store files by the EOS
\texttt{fst}}
\begin{verbatim}
sudo mkdir -p /fst
sudo chown daemon:daemon /fst/
\end{verbatim}

\subsection{Start the xrootd daemons that will run the EOS \texttt{mgm},
\texttt{mq} and \texttt{fst} plugins}
\begin{verbatim}
sudo service eos start
\end{verbatim}

\subsection{Enable the kerberos and simple shared secret authentication
mechanisms within EOS as opposed to xroot}
\begin{verbatim}
sudo eos vid enable sss
sudo eos vid enable krb5
\end{verbatim}

\subsection{Register the local /fst directory with the default EOS space}
\begin{verbatim}
sudo EOS_MGM_URL="root://devbox.cern.ch" eosfstregister -r /fst default:1
\end{verbatim}

\subsection{Put the EOS fst node on-line}
\begin{verbatim}
sudo eos node set devbox.cern.ch on
\end{verbatim}

\subsection{Enable the default EOS space}
\begin{verbatim}
sudo eos space set default on
\end{verbatim}

\subsection{Create the EOS namespace}

Create the \texttt{/eos} directory within the EOS namespace, map it to the EOS
\texttt{default} space and then set the number of replicas to 1.
\begin{verbatim}
sudo eos attr -r set default=replica /eos
sudo eos attr -r set sys.forced.nstripes=1 /eos
\end{verbatim}

\section{Compile CTA}

Make sure \texttt{yum} is configured to see the repositories that will provide
the packages required to build CTA. If CTA is to be compiled on SLC6 then only
the CTA command-line tool will be built and this will only require the
following epel repository configuration:
\begin{verbatim}
cat /etc/yum.repos.d/epel.repo
[epel]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN
baseurl=http://linuxsoft.cern.ch/epel/6/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=1
protect=0

[epel-debug]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - debug RPMs
baseurl=http://linuxsoft.cern.ch/epel/6/$basearch/debug/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0

[epel-source]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - source RPMs
baseurl=http://linuxsoft.cern.ch/epel/6/SRPMS/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0

[epel-testing]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - testing
baseurl=http://linuxsoft.cern.ch/epel/testing/6/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0

[epel-testing-debug]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - testing debug RPMs
baseurl=http://linuxsoft.cern.ch/epel/testing/6/$basearch/debug/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0

[epel-testing-source]
name=UNSUPPORTED: Extra Packages for Enterprise Linux add-ons, no formal support from CERN - testing source RPMs
baseurl=http://linuxsoft.cern.ch/epel/testing/6/SRPMS/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
gpgcheck=1
enabled=0
protect=0
\end{verbatim}

If CTA is to be built on CC7 then it will require the following \texttt{epel}
and \texttt{ceph} repository configurations:
\begin{verbatim}
cat /etc/yum.repos.d/epel.repo
[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
baseurl=http://linuxsoft.cern.ch/epel/7/$basearch
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7

[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - $basearch - Debug
baseurl=http://linuxsoft.cern.ch/epel/7/$basearch/debug
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1

[epel-source]
name=Extra Packages for Enterprise Linux 7 - $basearch - Source
baseurl=http://linuxsoft.cern.ch/epel/7/SRPMS
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=1
\end{verbatim}

\begin{verbatim}
cat /etc/yum.repos.d/ceph-cc7.repo
[cc7-ceph]
name=cc7-ceph
baseurl=http://linuxsoft.cern.ch/mirror/download.ceph.com/rpm-infernalis/el7/x86_64
gpgcheck=0
enabled=1
protect=0
priority=4
\end{verbatim}

In addition you will need to make sure the \texttt{cernonly} repository is
enabled:
\begin{verbatim}
egrep -A3 '\[cernonly\]' /etc/yum.repos.d/CentOS-CERN.repo
[cernonly]
name=CentOS-$releasever - CERN Only
baseurl=http://linuxsoft.cern.ch/cern/centos/$releasever/cernonly/$basearch/
gpgcheck=1
\end{verbatim}

Install \texttt{cmake} if it is not already installed:
\begin{verbatim}
sudo yum install cmake
\end{verbatim}

Obtain the URL of the CTA source-code repository by going to the following
CERN gitlab web page:
\begin{verbatim}
https://gitlab.cern.ch/cta/CTA
\end{verbatim}
You should copy the URL from this web page; at the time this document was
written the 3 possible URLs were:
\begin{verbatim}
HTTPS: https://gitlab.cern.ch/cta/CTA.git
KRB5:  https://:@gitlab.cern.ch:8443/cta/CTA.git
SSH:   ssh://git@gitlab.cern.ch:7999/cta/CTA.git
\end{verbatim}

Clone the \texttt{CTA} git repository (this example uses the KRB5 URL):
\begin{verbatim}
git clone https://:@gitlab.cern.ch:8443/cta/CTA.git
\end{verbatim}
This will create a directory called \texttt{CTA}.
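A missing repository typically only shows up later as a failed
\texttt{yum-builddep} run, so it can be worth confirming up front that the
repository ids assumed above are actually configured. The helper below is a
minimal sketch; the function name is hypothetical and the repository ids in
the usage example are the ones assumed earlier in this chapter.

```shell
# check_repos REPO_DIR REPO_ID...
# Reports whether each repository id appears as a [section] header in any
# .repo file under REPO_DIR.
check_repos() {
  local dir=$1 repo
  shift
  for repo in "$@"; do
    if grep -qs "^\[$repo\]" "$dir"/*.repo; then
      echo "found: $repo"
    else
      echo "missing: $repo"
    fi
  done
}

# Typical use on a CC7 development box before building CTA:
#   check_repos /etc/yum.repos.d epel cc7-ceph cernonly
```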
Create a build directory at the same level as the \texttt{CTA} directory, or
anywhere except inside the \texttt{CTA} directory:
\begin{verbatim}
mkdir CTA_build
\end{verbatim}

Enter the build directory and run the following \texttt{cmake} and
\texttt{make} commands in order to produce the source RPM of the CTA project:
\begin{verbatim}
cd CTA_build
cmake -DPackageOnly:Bool=true ../CTA
make cta_srpm
\end{verbatim}

Use \texttt{yum-builddep} to install all of the packages required to build the
rest of CTA:
\begin{verbatim}
sudo yum-builddep RPM/SRPMS/cta-0-0.src.rpm
\end{verbatim}

Delete and then re-create the \texttt{CTA\_build} directory:
\begin{verbatim}
cd ..
rm -rf CTA_build
mkdir CTA_build
\end{verbatim}

Go into the newly re-created \texttt{CTA\_build} directory and run
\texttt{cmake} to prepare to make all of the CTA project and not just the
source rpm:
\begin{verbatim}
cd CTA_build
cmake ../CTA
\end{verbatim}

Build CTA by running \texttt{make} twice in the \texttt{CTA\_build} directory.
\begin{verbatim}
make
make
\end{verbatim}

\section{Create the mock nameserver base directory}
We can do this by running the executable:
\begin{verbatim}
$ /nameserver/makeMockNameServerBasePath
\end{verbatim}
This command will return the newly created path to the mock nameserver base
directory. Now we give it full permissions, passing the path returned by the
previous command:
\begin{verbatim}
$ chmod -R 0777
\end{verbatim}
Now we need to add it as a configuration parameter in the
\texttt{castor.conf}, as in the following example:
\begin{verbatim}
TapeServer MockNameServerPath /tmp/CTAMockNS9r236q
\end{verbatim}

\section{Set up the objectstore VFS backend}
First we create the new objectstore VFS backend using a simple executable:
\begin{verbatim}
$ /objectstore/makeMinimalVFS
\end{verbatim}
This command will return the newly created path to the VFS backend. Now we
give it full permissions, passing the path returned by the previous command:
\begin{verbatim}
$ chmod -R 0777
\end{verbatim}
Now we need to add it as a configuration parameter in the
\texttt{castor.conf}, as in the following example:
\begin{verbatim}
TapeServer ObjectStoreBackendPath /tmp/jobStoreVFSOKJCjW
\end{verbatim}

\section{Setting up the environment}
\subsection{Virtual tape library}
The virtual tape library \texttt{mhvtl} will be used to simulate a tape drive.
It can be installed as follows (from the same CASTOR\_SLC6 repository):
\begin{verbatim}
yum install mhvtl-utils kmod-mhvtl
\end{verbatim}
The configuration files for mhvtl are:
\begin{itemize}
\item \texttt{/etc/mhvtl/mhvtl.conf}:
\item[{}]
\begin{boxedverbatim}
# Home directory for config file(s)
MHVTL_CONFIG_PATH=/etc/mhvtl
# Default media capacity (1 G)
CAPACITY=1000
# Set default verbosity [0|1|2|3]
VERBOSE=1
# Set kernel module debuging [0|1]
VTL_DEBUG=0
\end{boxedverbatim}
\item \texttt{/etc/mhvtl/device.conf}: