Commit 582af405 authored by Michael Davis

[LHCb] Adds minutes of LHCb meetings

parent 4abb1e4c
@@ -34,4 +34,4 @@ clobber: clean
 .INTERMEDIATE: $(LATEX_TMP)
-.PHONY: all show clean
+.PHONY: all show clean clobber
TARGETS = atlas20180703.pdf cta20180820.pdf gfal.pdf lhcb20180704.pdf lhcb20180808.pdf lhcb20180817.pdf
LATEX_TMP = *.aux *.bbl *.blg *.log *.dvi *.bak *.lof *.lol *.lot *.out *.toc
all: $(TARGETS)
%.pdf: %.tex
	pdflatex $<
	pdflatex $<
# Phony targets
clean:
	rm -f $(LATEX_TMP)
clobber: clean
	rm -f $(TARGETS)
.INTERMEDIATE: $(LATEX_TMP)
.PHONY: all clean clobber
\documentclass{lhcb+cta}
\renewcommand{\subheading}{Notes after CTA internal discussion on 20 August 2018}
\begin{document}
\section*{German's Notes}
The key point missing here is that it is up to LHCb to transfer the file from EOSLHCB to EOSCTALHCB and check from there. This implies that LHCb also have to do space management on EOSLHCB: they can remove the file from disk once they have processed it and it is safely on tape in EOSCTALHCB.
\section*{CTA Meeting Minutes}
Data workflows we would like to see:
LHCb will see two storage elements:
\begin{itemize}
\item A big EOS disk-only instance.
\item A small EOS/CTA instance behind the big EOS disk-only instance.
\end{itemize}
The big EOS disk-only instance will be responsible for receiving data from the experiment's DAQ system and for carrying out reprocessing, analysis and Tier-1 data transfers. The small EOS/CTA instance will be dedicated to staging files between the big EOS instance and tape. The small instance will not, for example, be available for Tier-1 GridFTP requests.
The expected lifecycle of a file going from DAQ to tape is:
\begin{enumerate}
\item Copy file from the DAQ to the big EOS disk-only instance.
\item Copy file from the big EOS disk-only instance to the small EOS/CTA instance.
\item Query the small EOS/CTA instance until the file is safely stored on tape.
\item Delete the file if necessary from the big EOS disk-only instance.
\end{enumerate}
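The sketch below illustrates steps 2--4 of this lifecycle using the gfal2 Python bindings that LHCb already use (see the accompanying gfal notes). It is illustrative only: the endpoint URLs and paths are placeholders, and the use of the \texttt{user.status} attribute to detect the ``on tape'' state is an assumption, not an agreed interface.
\begin{python}
# Hypothetical sketch of the EOS-to-tape part of the lifecycle; all URLs are placeholders.
import time
import gfal2

ctx = gfal2.creat_context()
eoslhcb_url = 'root://eoslhcb.cern.ch//eos/lhcb/raw/209383_0000001095.raw'
eoscta_url  = 'root://eosctalhcb.cern.ch//eos/ctalhcb/raw/209383_0000001095.raw'

# Step 2: copy the file from the big EOS disk-only instance to the small EOS/CTA
# instance, with checksum validation on the transfer.
params = ctx.transfer_parameters()
params.overwrite = True
params.checksum_check = True
ctx.filecopy(params, eoslhcb_url, eoscta_url)

# Step 3: poll the EOS/CTA instance until the file is reported safely on tape.
# 'user.status' mirrors the CASTOR example in the gfal notes (NEARLINE = on tape);
# the attribute actually exposed by EOS/CTA is still to be agreed.
while 'NEARLINE' not in ctx.getxattr(eoscta_url, 'user.status'):
    time.sleep(300)

# Step 4: only now is it safe to delete the copy on the big EOS disk-only instance.
ctx.unlink(eoslhcb_url)
\end{python}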
\section*{Massimo's Notes}
This is a summary which might be useful to prepare the meeting with our LHCb colleagues.
\begin{itemize}
\item CTA is getting ready: preview stage
\begin{itemize}
\item No real migration before LS2 (2018 data still going to CASTOR)
\begin{itemize}
\item Preparation steps (at least understanding the workflows and the product) can be done now
\item ST offers a testbed to LHCb
\end{itemize}
\item Use cases
\begin{itemize}
\item We see 3 main actors: DAQ (Pit->CC), FTS (CC->T1) and Dirac (CASTOR->EOS, batch upload/download by worker nodes)
\item Special case: removal of RAW at the Pit (i.e.\ how to decide when it is safe to remove a copy on the DAQ side)
\end{itemize}
\end{itemize}
\end{itemize}
CTA will be SRM-free and essentially a service in the CC (no GridFTP)
\begin{itemize}
\item Protocols: xrootd (and in future also HTTP)
\item Implications for DIRAC?
\begin{itemize}
\item Other experiments using DIRAC (such as ILC)
\end{itemize}
\end{itemize}
\item Expectations
\begin{itemize}
\item Describe what ATLAS and CMS have been doing since 2017 and what ALICE is doing for the 2018 run
\item In this model EOSLHCb will be the receiving end for data transfers (from DAQ), the source for exports to LCG sites, and will feed/receive data from/to CTA
\end{itemize}
\item CTA can be interrogated to know whether a file has just been received (disk cache) or is already on tape (cf.\ the m-bit in CASTOR)
\item The CTA cache is small and optimised to match user activity and tape handling, which is why it is not directly exposed to, for example, long-running batch jobs
\item CTA is prepared to consider future use cases (the ``do more with tapes'' mantra), but this will require discussions (mutual understanding) to put a good solution in place
\end{itemize}
My take is that we are all on the same page. I think it is important (as mentioned by everyone) that we speak with one voice and present a coherent view before proposing alternative scenarios. The fact that ``ATLAS and CMS are happy with this model'' will probably simplify the discussion.
I think it is important to verify whether we should also include Niko Neufeld, since (at least historically) he is in charge of data removal at the pit.
\end{document}
\ No newline at end of file
\documentclass{lhcb+cta}
\renewcommand{\mainheading}{LHCb gfal Python code}
\renewcommand{\subheading}{Notes from Christophe Haen}
\begin{document}
\section*{List of gfal2 calls used by LHCb}
\begin{python}
bring_online
bring_online_poll
checksum
filecopy
getxattr
listdir
listxattr
mkdir_rec
release
rmdir
set_opt_boolean
stat
transfer_parameters
unlink
\end{python}
\section*{Protocol}
Changing the protocol from SRM to XRootD has no side-effects, provided that the protocol behaves the same. The URL is constructed on the fly, using a configuration file. If we have to tell DIRAC that, from now on, it should rely on the xroot protocol, that is one line in the configuration file for us. But of course, that assumes the protocol exposes the same functionality, with the same interface.
See below the Python/gfal2 calls used by LHCb to stat a file, stage it, and monitor the staging status.
\section*{Stat a file}
\begin{python}
# gfal2_stat.py
import gfal2
fileURL = ('srm://srm-lhcb.cern.ch:8443/srm/managerv2?SFN=/castor'
           '/cern.ch/grid/lhcb/data/2018/RAW/FULL/LHCb/COLLISION18/209383'
           '/209383_0000001095.raw')
ctx = gfal2.creat_context()
ctx.set_opt_boolean('BDII', 'ENABLE', False)
ctx.set_opt_boolean('GRIDFTP PLUGIN', 'SESSION_REUSE', False)
ctx.set_opt_boolean('GRIDFTP PLUGIN', 'IPV6', True)
ctx.set_opt_integer('SRM PLUGIN', 'OPERATION_TIMEOUT', 100)
ctx.set_opt_string('SRM PLUGIN', 'SPACETOKENDESC', 'LHCb-Tape')
ctx.set_opt_integer('SRM PLUGIN', 'REQUEST_LIFETIME', 3600)
ctx.set_opt_string_list('SRM PLUGIN', 'TURL_PROTOCOLS', ['gsiftp'])
print "stat"
print ctx.stat(fileURL)
print "checksum"
print ctx.checksum( fileURL, 'ADLER32')
print "status"
print ctx.getxattr( fileURL, 'user.status')
\end{python}
\subsection*{Result}
\begin{python}
stat
uid: 45
gid: 46
mode: 100644
size: 5242935904
nlink: 1
ino: 0
ctime: 1528143170
atime: 0
mtime: 1528143170
checksum
0e59c8c5
status
NEARLINE
\end{python}
\section*{Stage a file}
\begin{python}
# gfal2_prestage.py
import gfal2
fileURL = ('srm://srm-lhcb.cern.ch:8443/srm/managerv2?SFN=/castor'
           '/cern.ch/grid/lhcb/data/2018/RAW/FULL/LHCb/COLLISION18/209383'
           '/209383_0000001095.raw')
lifetime = 86400
timeout = 43200
async = True
ctx = gfal2.creat_context()
ctx.set_opt_boolean('BDII', 'ENABLE', False)
ctx.set_opt_boolean('GRIDFTP PLUGIN', 'SESSION_REUSE', False)
ctx.set_opt_boolean('GRIDFTP PLUGIN', 'IPV6', True)
ctx.set_opt_integer('SRM PLUGIN', 'OPERATION_TIMEOUT', 100)
ctx.set_opt_string('SRM PLUGIN', 'SPACETOKENDESC', 'LHCb-Tape')
ctx.set_opt_integer('SRM PLUGIN', 'REQUEST_LIFETIME', 3600)
ctx.set_opt_string_list('SRM PLUGIN', 'TURL_PROTOCOLS', ['gsiftp'])
print ctx.bring_online(fileURL, lifetime, timeout, async)
\end{python}
\subsection*{Result}
\begin{python}
(0, '53277390')
\end{python}
\section*{Monitor the Staging Status}
\begin{python}
# gfal2_stagingStatus.py
import gfal2
fileURL = ('srm://srm-lhcb.cern.ch:8443/srm/managerv2?SFN=/castor'
           '/cern.ch/grid/lhcb/data/2018/RAW/FULL/LHCb/COLLISION18/209383'
           '/209383_0000001095.raw')
# output from the bring_online command
token = '53277390'
ctx = gfal2.creat_context()
ctx.set_opt_boolean('BDII', 'ENABLE', False)
ctx.set_opt_boolean('GRIDFTP PLUGIN', 'SESSION_REUSE', False)
ctx.set_opt_boolean('GRIDFTP PLUGIN', 'IPV6', True)
ctx.set_opt_integer('SRM PLUGIN', 'OPERATION_TIMEOUT', 100)
ctx.set_opt_string('SRM PLUGIN', 'SPACETOKENDESC', 'LHCb-Tape')
ctx.set_opt_integer('SRM PLUGIN', 'REQUEST_LIFETIME', 3600)
ctx.set_opt_string_list('SRM PLUGIN', 'TURL_PROTOCOLS', ['gsiftp'])
print ctx.bring_online_poll(fileURL, token)
\end{python}
\subsection*{Result}
Returns 0 if the file is not yet staged and 1 once it is staged.
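\section*{Sketch: the Same Calls over XRootD}
For reference, below is a minimal sketch of what the same stat/stage/poll sequence might look like once the SRM URL is replaced by an XRootD one. The endpoint and path are placeholders, and whether the gfal2 XRootD plugin exposes \texttt{bring\_online} and the \texttt{user.status} attribute with exactly the same semantics still needs to be confirmed.
\begin{python}
# Hypothetical sketch: the same gfal2 calls against an XRootD endpoint.
# The host name and path are placeholders, not an agreed interface.
import gfal2

fileURL = ('root://eosctalhcb.cern.ch//eos/ctalhcb/grid/lhcb/data/2018/RAW'
           '/FULL/LHCb/COLLISION18/209383/209383_0000001095.raw')

ctx = gfal2.creat_context()

print ctx.stat(fileURL)                     # same interface as the SRM examples
print ctx.checksum(fileURL, 'ADLER32')
print ctx.getxattr(fileURL, 'user.status')  # assumed to report NEARLINE/ONLINE

# Staging: the same bring_online/bring_online_poll pair, minus the SRM-only options
(errcode, token) = ctx.bring_online(fileURL, 86400, 43200, True)
print ctx.bring_online_poll(fileURL, token)
\end{python}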
\end{document}
\ No newline at end of file
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{lhcb+cta}[2018/07/04 LaTeX class for minutes of LHCb+CTA discussions]
\LoadClass[11pt,a4paper]{article}
%%
%% Page size
%%
\RequirePackage[top=3cm, bottom=3cm, left=3cm, right=3cm]{geometry}
\RequirePackage{parskip}
%%
%% Colours and Fonts
%%
\RequirePackage{color}
% CERN blue is Pantone 286 = RGB 56 97 170, defined as cern@blue below
\definecolor{cern@ltblue}{rgb}{0.415686,0.611765,0.964706} % RGB 106 156 246
\definecolor{cern@blue} {rgb}{0.219608,0.380392,0.666667} % RGB 56 97 170
\definecolor{cern@dkblue}{rgb}{0.082353,0.184314,0.364706} % RGB 21 47 93
% Complementary colours
\definecolor{cern@ltcomp}{rgb}{0.666667,0.525490,0.219608} % RGB 170 134 56
\definecolor{cern@dkcomp}{rgb}{0.364706,0.266667,0.047059} % RGB 93 68 12
% Set serif font to Paratype
\RequirePackage{paratype}
\RequirePackage[T1]{fontenc}
% Set section headings font/colour
\RequirePackage{sectsty}
%\setsansfont{AvenirLTStd-Book}
\sectionfont{\color{cern@blue}\sffamily}
\subsectionfont{\color{cern@blue}\sffamily}
% Set itemize bullet style
\newcommand{\textbulletsquare}{\textcolor{cern@ltcomp}{\raisebox{.25ex}{\rule{0.9ex}{0.9ex}}}}
\renewcommand{\labelitemi}{\textbulletsquare}
%%
%% Page Headers/Footers
%%
\RequirePackage{fancyhdr}
\fancypagestyle{plain}{%
\fancyhf{} % clear all header and footer fields
\fancyfoot[C]{\color{cern@dkblue}\sffamily\fontsize{9pt}{9pt}\selectfont\thepage} % except the center
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}}
\pagestyle{plain}
%%
%% Tables
%%
\RequirePackage{tabularx}
\newcommand{\colhead}[1]{\multicolumn{1}{c}{\color{cern@blue}\textsf{#1}}}
\renewcommand{\arraystretch}{1.5}
%%
%% Other packages and macros
%%
\RequirePackage{hyperref}
\hypersetup{
colorlinks=true, % false: boxed links / true: coloured links
linkcolor=cern@ltblue, % colour of internal links (change box color with linkbordercolor)
citecolor=cern@ltblue, % colour of bibliography links
filecolor=cern@ltblue, % colour of file links
urlcolor=cern@ltblue % colour of external links
}
%%
%% Triangle with an exclamation point
%%
\RequirePackage[utf8]{inputenc}
\RequirePackage{newunicodechar}
\newcommand\Alert{%
\makebox[1.4em][c]{%
\makebox[0pt][c]{\raisebox{.1em}{\small!}}%
\makebox[0pt][c]{\color{red}\Large$\bigtriangleup$}}}%
% Shortcuts to handle 1st, 2nd, 3rd, 4th
\RequirePackage{relsize}
\newcommand{\squared}{\ensuremath{^{\textsf{2}}} }
\newcommand{\superscript}[1]{\ensuremath{^{\textsf{\smaller #1}}}}
\newcommand{\subscript}[1]{\ensuremath{_{\textsf{\smaller #1}}}}
\newcommand{\superrm}[1]{\ensuremath{^{\textrm{\smaller #1}}}}
\newcommand{\st}[0]{\superscript{st} }
\newcommand{\nd}[0]{\superscript{nd} }
\newcommand{\rd}[0]{\superscript{rd} }
\renewcommand{\th}[0]{\superscript{th} }
\newcommand{\strm}[0]{\superrm{st} }
\newcommand{\ndrm}[0]{\superrm{nd} }
\newcommand{\rdrm}[0]{\superrm{rd} }
\newcommand{\thrm}[0]{\superrm{th} }
% Header on first page
\newcommand{\mainheading}{CTA+LHCb Discussion}
\newcommand{\subheading}{Minutes of the meeting of \thisminutesdate}
\AtBeginDocument{
{\sffamily\Huge\color{cern@dkblue} \mainheading}\\
{\color{cern@dkblue}\rule{\textwidth}{1pt}
\color{cern@ltblue}\sffamily\Large\subheading}\\
}
% Definitions and environments for standard sections
\newcommand{\present}[2]{%
\section*{Present}
\begin{itemize}
\item \textbf{IT:} #1
\item \textbf{LHCb:} #2
\end{itemize}}
\newcommand{\agenda}[1][%
\item Review of the previous minutes
\item Review of the previous action list
\item News from IT
\item News from LHCb
\item AOB
\item Date of the next meeting]{%
\section*{Agenda}
\begin{enumerate}
#1
\end{enumerate}}
\newcommand{\previousminutes}[1][Approved.]{%
\section*{Review of the Previous Minutes}
#1}
\newcommand{\aob}[1][There was no other business.]{%
\section*{AOB}
#1
}
\newcommand{\nextmeeting}[1][09h30 on Wednesday \nextminutesdate, 2--R--030]{
\section*{Next Meeting}
#1
}
\newenvironment{actionlist}
{\begingroup\tabularx{\textwidth}{clXll}
\colhead{\#} & \colhead{Who} & \colhead{What} & \colhead{Added} & \colhead{Status}\\
\hline}{\endtabularx\endgroup}
% Default fixed font does not support bold face
\DeclareFixedFont{\ttb}{T1}{txtt}{bx}{n}{12} % for bold
\DeclareFixedFont{\ttm}{T1}{txtt}{m}{n}{12} % for normal
% Custom colors
\definecolor{deepblue}{rgb}{0,0,0.5}
\definecolor{deepred}{rgb}{0.6,0,0}
\definecolor{deepgreen}{rgb}{0,0.5,0}
\RequirePackage{listings}
% Python style for highlighting
\newcommand\pythonstyle{\lstset{
language=Python,
basicstyle=\ttm,
otherkeywords={self}, % Add keywords here
keywordstyle=\ttb\color{deepblue},
emph={MyClass,__init__}, % Custom highlighting
emphstyle=\ttb\color{deepred}, % Custom highlighting style
stringstyle=\color{deepgreen},
frame=tb, % Any extra options here
showstringspaces=false %
}}
% Python environment
\lstnewenvironment{python}[1][]
{
\pythonstyle
\lstset{#1}
}
{}
% Python for external files
\newcommand\pythonexternal[2][]{{
\pythonstyle
\lstinputlisting[#1]{#2}}}
% Python for inline
\newcommand\pythoninline[1]{{\pythonstyle\lstinline!#1!}}
\ No newline at end of file
\documentclass{lhcb+cta}
\newcommand{\thisminutesdate}{4 July 2018}
\begin{document}
\present{Michael Davis, Julien Leduc}{Joël Closier, Christophe Haen}
\section*{Protocols}
One of the key points from the discussion is that LHCb are staging files directly from CASTOR to other Storage Elements (SEs) across the grid. For this use case they need to support all the protocols in use at all Tier-1 sites, not just the protocols in use at Tier-0. Some sites use dCache, which does not support XRootD 3\rd Party Copy (TPC). Another system in use at some sites is StoRM. (And DPM? Not mentioned in the discussion.)
For EOS disk access, they use only XRootD for reading and writing, but for CASTOR tape they use only SRM, for the reason mentioned above: their software needs to write to SEs which don't support XRootD, and they want to use the same protocol everywhere.
The main library/abstraction layer that they use to query stagers \textit{etc.} is \texttt{gfal2}. They use the Python bindings to the library.
They have also started to add support for ECHO (Ceph-based disk storage at RAL). RAL prefer to use \texttt{GSIFTP} with an XRootD plugin written by Sebastien Ponce. \texttt{GSIFTP} is a subset of the \texttt{GridFTP} protocol, essentially standard FTP enhanced to use GSI security. It does not include many of the high-performance \texttt{GridFTP} protocol features, such as parallel data transfer, automatic TCP window/buffer sizing, enhanced reliability, \textit{etc.} LHCb will need to be able to write to RAL using TPC.
\Alert{Find out from Giuseppe how it currently works when staging from CASTOR to ECHO disk at RAL.}
From the pit, they write directly to CASTOR and data is immediately staged for writing to tape.
TPC must be possible to all SEs, not just T0. This will be made possible by XRootD v5, which is due to be released in the autumn.
If we don't want to allow files to be staged directly from CTA to Tier-1 SEs, this will require a change in LHCb workflows, so we will need to discuss this with them.
\section*{Space Token}
The Space Token is an SRM concept. Many sites only work with the Space Token, so they are bound to SRM.
The Space Token is also used for accounting. However, on CASTOR, space accounting gives the space used on the disk stager, not the amount of data archived to tape, so it is not so useful.
\Alert{Is this the same thing as the ``JSON file with space'' that ATLAS were talking about?}
\section*{Permissions}
The typical use case is production staging. User staging is very rare. However, in some cases users need permission to access tape (batch retrievals).
In CASTOR, only the Production role in LHCb VOMS has the rights to write to tape, but some users without that role need to have permission to read. One common use case is that the calibration team need to recall a single file. They are a special group who have this specific requirement. They don't need write access.
LHCb say that they do not submit multiple requests for the same data and then cancel them as soon as one is fulfilled. If they request to stage something, it means they really want it.
Permissions need to be consistent across all WLCG SEs. Most users are not CERN users.
There was a question about how grid certificates will be mapped to privileged users. Will this be under the control of the experiment or will they have to ask us to do it?
\section*{Testing}
LHCb are happy to let ATLAS do the load testing. If CTA can handle ATLAS loads, then it will not have a problem with LHCb loads.
A good test for them would be to stage 100,000 files from tape and distribute them to many sites across the grid.
\section*{Additional Notes from Christophe 17/07/2018}
There are two kinds of jobs: production jobs and user jobs.
\subsection*{Production Jobs}
These are run by the experiment's data management experts, who control what jobs run and when they run.
The normal use case is pre-staging the data: copy all the needed data to a disk storage (T0 EOS or another T1 SE). Apart from some exceptional circumstances, production jobs do not access tape storage. The output is always written to disk. The transfer to a tape storage is done asynchronously using FTS.
\subsection*{User Jobs}
Users are allowed to run jobs on files in CASTOR (with some restrictions). Depending on the computing model that will be adopted for Run3, this might increase. At the moment it is ``very low''.
When a user job requests a file on tape, before starting the job, the Python/gfal2 programs are used to centrally stage the required files. Once the file is in the CASTOR disk cache, the job is woken up and sent to a site, which will try to read the file from the disk frontend.
The output of user jobs is never written to tape.
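As an illustration, the central pre-staging described above amounts to a \texttt{bring\_online}/\texttt{bring\_online\_poll} loop built from the gfal2 calls documented in the accompanying gfal notes. The sketch below is illustrative only; the lifetime, timeout and polling interval are placeholders.
\begin{python}
# Illustrative sketch of the central pre-staging step for a user job.
import time
import gfal2

fileURL = ('srm://srm-lhcb.cern.ch:8443/srm/managerv2?SFN=/castor'
           '/cern.ch/grid/lhcb/data/2018/RAW/FULL/LHCb/COLLISION18/209383'
           '/209383_0000001095.raw')

ctx = gfal2.creat_context()
ctx.set_opt_string('SRM PLUGIN', 'SPACETOKENDESC', 'LHCb-Tape')

# Submit the (asynchronous) staging request, then poll until the file is on the
# CASTOR disk cache; only then is the job woken up and sent to a site.
(errcode, token) = ctx.bring_online(fileURL, 86400, 43200, True)
while ctx.bring_online_poll(fileURL, token) == 0:
    time.sleep(600)
\end{python}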
\end{document}
\ No newline at end of file
\documentclass{lhcb+cta}
\newcommand{\thisminutesdate}{8 August 2018}
\begin{document}
This meeting was to discuss data transfers between the LHCb pit and T0. Niko and Tommaso are the team responsible for this, including any software development. For transfers between T0 and T1, our contacts are Christophe and Jo\"{e}l.
\present{Michael Davis}{Niko Neufeld, Tommaso Columbo}
\section*{Workflow and Use Cases}
Data sent from the LHCb pit to CASTOR is transferred using Dirac and the XRootD protocol. Dirac uses a pull workflow, unlike Rucio, which is based on a push workflow.
Monte Carlo data is generated directly in the filters during periods of no data taking, so from our perspective it is exactly the same as raw data. The only difference is that it is generated at a much lower rate.
Besides writing raw data, they have a second use case where they sometimes bring back data from tape for reconstruction.
When raw data is sent to CASTOR, LHCb actively checks (a) that the file has been safely archived to tape, and (b) the checksum, before deleting the file from the pit.
They have been toying with the idea of having a separate EOS instance for the raw data. If they decide to go this way, then the LHCb pit will appear like a T1 to us.
\section*{The Way Forward}
Our proposal is that LHCb will send raw data to EOSLHCB and not EOSCTALHCB. They are fine with this change, but would like us to provide an API so that they can confirm when the file is safely on tape. If they have to poll EOSLHCB to get this information, that is fine.
Perhaps the ``file on tape'' state will be provided as an XRootD extended attribute? If so, it would require an update to an as-yet-unreleased version of XRootD. This is also OK from their side, but will require some coordination with the Dirac developers.
We need to have the same conversation around the workflow between T0 and T1. I spoke to Jo\"{e}l and will schedule another meeting with him and Christophe after Julien is back.
When we are ready, we should go back to them with our API proposal and work out a timeline for setting up a test instance.
\end{document}
\ No newline at end of file
\documentclass{lhcb+cta}
\newcommand{\thisminutesdate}{17 August 2018}
\begin{document}
\present{Michael Davis, Julien Leduc}{Joël Closier, Christophe Haen}
We discussed the LHCb workflows in more detail, in the context of the proposed EOSLHCb/EOSCTALHCb configuration. LHCb understand that they will need to change their workflows and this will require some development effort on their part. For the most common workflows there is no fundamental problem. However, there are some use cases which will require some creative thinking on our part.
\section*{Differences between LHCb and ATLAS}
CH pointed out some of the differences between ATLAS and LHCb workflows. ATLAS focuses on maximum availability of data, whereas LHCb focuses on the most efficient use of resources. This philosophical difference leads to several differences in practice:
\begin{itemize}
\item LHCb keeps fewer disk copies across the grid. ATLAS may copy a file a dozen times to different SEs, while LHCb has one or two. So the impact of transient errors (site down, disk pool unavailable) as well as permanent errors (corrupted disk file) is much higher for LHCb, requiring them to get the file from tape in these cases.
\item ATLAS sites will submit many requests for a file to different SEs (including T0), and then when the file arrives they cancel the other requests. LHCb do not do this. If they request a file from T0, it means they really need it.
\end{itemize}
\section*{Latency of writing raw data to tape}
Currently data is written directly from the LHCb pit to CASTOR. In our proposed CTA setup there will be one extra hop: first to the big EOS instance, then to the EOSCTA instance. CH is concerned that the additional copy will add latency to the time taken to archive files, perhaps requiring them to increase the size of the disk cache at the pit.
We should measure this in order to determine if this is really a problem. LHCb CASTOR logs will give us the current latencies. We could use the ATLAS CTA test instance to measure how much difference the extra intermediate copy makes.
\section*{Data Taking Workflow}
The main issues are:
\begin{itemize}
\item Concerns about the latency from the extra hop as mentioned above.
\item They do not want to open the files up to the grid until they are reported safely on tape. This is because they want the checksum of the disk file and tape file to have been validated before production jobs and users start to process the file, to guard against data corruption.
\end{itemize}
\section*{Data colocation}
JC asked if CTA will have the same feature as CASTOR where data is allocated to a tape family. They are concerned that raw data and data from production jobs and user jobs should be kept on physically separate storage, and that they should have some control over what goes together. (As I understand it, we will offer exactly the same functionality as CASTOR via Storage Classes and Tape Pools; it is just that the terminology used by the experiments is a little different.)
Specifically, they are concerned about data colocation: reducing the number of tape mounts, and therefore the latency, needed to retrieve a complete dataset. I said we are aware of this problem but do not have a definite solution at the moment. I mentioned that we have a Ph.D. student starting in October to look at how we can optimise access to the tape storage, including data colocation.
\section*{Jobs which run on CASTOR}
LHCb have some use cases for running jobs on CASTOR which need to be addressed in our CTA setup:
\begin{itemize}
\item Calibration team: their workflow requires them to recall a single file from tape and read it once. This is done infrequently, but they need to be able to recall the files quickly ($<$ 1 hour). These jobs are run on CASTOR, as it is a waste of time and resources to copy a file to the main disk instance just to read it once and then discard it.
\item Fallback of last resort for user jobs: normally users access a disk copy of their files, but in some cases the file may be temporarily unavailable. When this happens, they would like to have the option of running the user job on the stager instance until the disk copies become available again. At the moment this use case is very rare, but depending on the computing model that will be adopted for Run3, it might increase.
\end{itemize}
These use cases require read access to tape only. Only the Production role in LHCb VOMS can write to tape.
\section*{File Transfer Protocols}
They use FTS with \texttt{gfal2}/SRM underneath for all staging out of CASTOR. FTS is used everywhere except for export from the pit and the one-shot jobs run on the CASTOR instance, as mentioned above.
Ideally they will be able to modify their \texttt{gfal2} scripts to change the protocol to XRootD and it will ``just work''. One issue is that XRootD 3\rd Party Copy is not supported in all cases (e.g.\ transfers to dCache), so this could result in an extra local copy and additional network traffic.
There was a question about grid transfers to RAL/ECHO, which only supports \texttt{GSIFTP}. (Is this currently done using Sebastien Ponce's XRootD plugin for CASTOR?)
There was a question about how grid certificates will be mapped to privileged users. Will this be under the control of the experiment or will they have to ask us to do it?
\end{document}
\ No newline at end of file