- Nov 28, 2022
Joao Afonso authored
New states REPACKING/EXPORTED, new internal states, new maintenance runner for cleaning up retrieve queue requests
-
- Apr 29, 2022
Jorge Camarero Vera authored
-
- Mar 28, 2022
Jorge Camarero Vera authored
-
- Nov 12, 2021
Jorge Camarero Vera authored
-
- Nov 09, 2021
mvelosob authored
-
- Nov 08, 2021
Jorge Camarero Vera authored
-
- Oct 12, 2021
mvelosob authored
During the July data challenge, the archiveJobTransferForUser queue for the r_atlas_test_datachallenge tapepool became full of ArchiveJobs whose bytes field became zero after they were popped from the queue. This caused the tape servers to pop ~5TB of work from the queue. To prevent this from happening in the future, instead of summing the sizes of the individual elements popped from the queue, we now subtract them from the total size popped. This way, if popped jobs have incorrectly set their bytes field to zero, the algorithm will consume less data than expected, not exponentially more.
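The difference between the two accounting styles can be illustrated with a minimal sketch. It is not the CTA code: `PoppedJob`, `consumedBySumming`, `consumedBySubtracting`, and the idea that the queue reports a batch total at pop time are all assumptions made for illustration, and the sketch reflects only one possible reading of the commit message.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a popped archive job (not the CTA type).
struct PoppedJob {
  std::uint64_t bytes;  // size as read after the pop; may wrongly be zero
  bool kept;            // false if the job was dropped from the batch
};

// Summing style: add up the sizes of the jobs that were kept.
// If the sizes are wrongly zero, the result under-counts the work taken.
std::uint64_t consumedBySumming(const std::vector<PoppedJob>& jobs) {
  std::uint64_t total = 0;
  for (const auto& job : jobs) {
    if (job.kept) total += job.bytes;
  }
  return total;
}

// Subtracting style: start from the total the queue reported for the batch
// (an assumption of this sketch) and subtract only the dropped jobs.
// If the sizes are wrongly zero, the result over-counts the work taken.
std::uint64_t consumedBySubtracting(std::uint64_t queueReportedBytes,
                                    const std::vector<PoppedJob>& jobs) {
  std::uint64_t total = queueReportedBytes;
  for (const auto& job : jobs) {
    if (!job.kept) total -= std::min(job.bytes, total);  // never go below zero
  }
  return total;
}
```

With correct sizes both functions agree. With zeroed sizes, the summing style reports that nothing was consumed, so a caller working against a byte budget would keep popping, while the subtracting style reports the full batch and the caller stops early.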
-
- Aug 02, 2021
Jorge Camarero Vera authored
-
- Jun 15, 2021
Jorge Camarero Vera authored
-
- Jun 02, 2021
Jorge Camarero Vera authored
-
- Nov 22, 2019
Cedric CAFFY authored
Retry queueing in ArchiveQueueToReportToRepackForSuccess if switchElementsOwnership fails because of a rados::lockbackoff() problem
-
- Jul 02, 2019
Cedric CAFFY authored
-
- Apr 15, 2019
Eric Cano authored
-
- Feb 22, 2019
Cedric CAFFY authored
Queueing of Archive Jobs is done and unit tested. Queueing of Retrieve Requests is not completely done yet.
-
- Jan 25, 2019
Cedric CAFFY authored
Changed the status of succeeded retrieve requests to RJS_Succeeded and inserted them into the RetrieveQueueToReportToRepackForSuccess. Unit tested, but with a memory leak.
-
- Dec 20, 2018
Cedric CAFFY authored
-
- Dec 13, 2018
- Dec 10, 2018
Eric Cano authored
This promotion is controlled so that only a limited number of requests are in the state ToExpand or Starting at any point in time. This ensures both the availability of repack file requests to the system while preventing an explosion of file-level requests. Created a one-round popping from the container (algorithms) with status switching, used for repack requests switching from Pending to ToExpand. Added ElementStatus to algorithms. Implemented promotion interface in Scheduler and OstoreDb. The actual decision is taken at the Scheduler level. The function itself is called by the RepackRequestManager. Promotion is tested in a unit test. Various code maintenance: Switched to "using"-based constructor inheritance. Fixed privacy of function in cta::range.
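The bounded promotion described above could look roughly like the following. This is a sketch under assumptions, not the Scheduler or OstoreDb code: the `RepackStatus` enum, the `promotePending` function, and the `maxActive` parameter are hypothetical names, and the requests are held in a plain vector rather than an object-store container.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical subset of repack request states (names chosen for this sketch).
enum class RepackStatus { Pending, ToExpand, Starting, Running };

// Promote Pending requests to ToExpand in a single pass, so that at most
// maxActive requests are in ToExpand or Starting afterwards.
// Returns the number of requests promoted.
std::size_t promotePending(std::vector<RepackStatus>& requests,
                           std::size_t maxActive) {
  // First pass: count requests that already occupy an "active" slot.
  std::size_t active = 0;
  for (RepackStatus status : requests) {
    if (status == RepackStatus::ToExpand || status == RepackStatus::Starting) {
      ++active;
    }
  }
  // Second pass: fill the remaining slots, oldest Pending request first.
  std::size_t promoted = 0;
  for (RepackStatus& status : requests) {
    if (active + promoted >= maxActive) break;
    if (status == RepackStatus::Pending) {
      status = RepackStatus::ToExpand;
      ++promoted;
    }
  }
  return promoted;
}
```

Counting the occupied slots before promoting keeps the limit a property of the whole set, so calling the function repeatedly cannot push more than `maxActive` requests into the expanding states.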
-
- Oct 01, 2018
- Sep 07, 2018
Michael Davis authored
-
- Sep 05, 2018
Michael Davis authored
-
- Sep 03, 2018
Michael Davis authored
No inheritance, instead have partial or full specialization based on two template parameters.
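As an illustration of the general technique (not the actual classes touched by this commit), a traits template keyed on two parameters can be partially or fully specialized without any base class. All names below (`QueueTraits`, the tag types, `name()`) are hypothetical.

```cpp
#include <string>

// Hypothetical tag types standing in for container and queue kinds.
struct ArchiveQueueTag {};
struct RetrieveQueueTag {};
struct ToTransferTag {};
struct ToReportTag {};

// Primary template: declared but not defined, so an unsupported
// combination fails at compile time instead of falling back to a base class.
template <typename Container, typename QueueType>
struct QueueTraits;

// Partial specialization: fixes the container, leaves the queue type open.
template <typename QueueType>
struct QueueTraits<ArchiveQueueTag, QueueType> {
  static std::string name() { return "ArchiveQueue"; }
};

// Full specialization: fixes both template parameters.
template <>
struct QueueTraits<RetrieveQueueTag, ToTransferTag> {
  static std::string name() { return "RetrieveQueueToTransfer"; }
};
```

Here `QueueTraits<ArchiveQueueTag, ToReportTag>` resolves to the partial specialization, while `QueueTraits<RetrieveQueueTag, ToReportTag>` has no definition and does not compile when used.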
-
Michael Davis authored
-
- Aug 30, 2018
Eric Cano authored
Changed the lifecycle of the ArchiveRequest to handle the various combinations of several jobs and their respective successes/failures. Most notably, the request now holds a reportdecided boolean, which is set when deciding to report. This happens when failing to archive one copy (first failure), or when all copies are transferred (success for all copies). Added support for in-mount retries. On failure, the job will be requeued (with a chance to pick it up again) in the same session if same-session retries are not exceeded. Otherwise, the job is left owned by the session, to be picked up by the garbage collector at tape unmount. Made disk reporter generic, dealing with both success and failure. Improved mount policy support for queueing. Expanded information available in popped elements from archive queues. Added optional parameters to ArchiveRequest::asyncUpdateJobOwner() to cover various cases. Updated the archive job statuses. Clarified naming of functions (transfer/report failure instead of bare "failure"). Updated garbage collector for new archive job statuses. Added support for report retries and batch reporting in the scheduler database. Updated obsolete wording in MigrationReportPacker log messages and error counts.
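The in-mount retry decision described above can be sketched as a small function. This is an illustration under assumptions: `RetryState`, `FailureAction`, `onTransferFailure`, and the exact comparison used for "not exceeded" are hypothetical and not taken from the CTA sources.

```cpp
// Hypothetical outcome of handling a failed transfer within a tape session.
enum class FailureAction {
  RequeueInSession,         // job may be picked up again by the same session
  LeaveForGarbageCollector  // job stays owned by the session until unmount
};

// Hypothetical per-job retry bookkeeping for the current mount.
struct RetryState {
  unsigned attemptsWithinMount;     // failed attempts so far in this session
  unsigned maxAttemptsWithinMount;  // assumed limit on attempts per session
};

// Record one failure and decide what to do with the job.
FailureAction onTransferFailure(RetryState& state) {
  ++state.attemptsWithinMount;
  if (state.attemptsWithinMount < state.maxAttemptsWithinMount) {
    return FailureAction::RequeueInSession;
  }
  // Limit reached: do not requeue; the garbage collector is expected to
  // handle the job once the session ends.
  return FailureAction::LeaveForGarbageCollector;
}
```

With a limit of two attempts, the first failure requeues the job and the second leaves it for the garbage collector.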
-
Eric Cano authored
-
Eric Cano authored
"ToTransfer" requests are to be picked up by tape sessions. "ToReport" includes both successes and failures to report, as the mechanism to report is the same. They will be handled by the reporter, which shares the single thread of the garbage collector. "Failed" will be a (possibly non-queue) container which will contain the failed requests. The operators will be able to examine, relaunch or abandon those requests. The states and lifecycles of the requests have been reworked to reflect this lifecycle too. The container algorithms have been adapted to handle the multiple queue/container types.
-
Eric Cano authored
-
Eric Cano authored
-
- Aug 17, 2018
Michael Davis authored
-
- Aug 10, 2018
Michael Davis authored
-
Michael Davis authored
-
Michael Davis authored
-
Michael Davis authored
-
Michael Davis authored
-
Michael Davis authored
-
Michael Davis authored
-
Michael Davis authored
-
Eric Cano authored
-