-
Notifications
You must be signed in to change notification settings - Fork 31
Creating your own Rucio Oracle DB
If you want to experiment with your own instance of Rucio Oracle database, you can use general purpose database service provided by CERN IT-DB:
https://cern.service-now.com/service-portal/service-element.do?name=general-purpose-db
Here is a direct link to the instructions how to request the account (need to be approved by DBA):
https://cern.service-now.com/service-portal/article.do?n=KB0000829
You will likely want either DEVDB11 or DEVDB12 database.
Once you get the account and reset the password, you can put them in the corresponding sections of your rucio.cfg ( [database] ) and alembic.ini ([bootstrap]) on your rucio server and then run
/usr/rucio/tools/bootstrap.py to create both root account and schema,
/usr/rucio/tools/reset_database.py to initialize or reset the schema.
connect to @DEVDB12 and issue the SQL command:
"select * from user_ts_quotas;"
If you do need more space, contact oracle.support@cern.ch to request an increase of your quota specifying the account, database (DEVDB12) and tablespace concerned with an explanation and estimate of your usage for the next one to three years.
login to https://session-manager.web.cern.ch/ with your account/password and database
ATLAS database monitoring provides similar information in one page : https://atlas-service-dbmonitor.web.cern.ch/atlas-service-dbmonitor/dashboard/show_sessions.php?user=ATLAS_RUCIO_W&db=ADCR
In addition to web pages we also run a client-side sessions monitoring via the following SQL query:
set pagesize 0; SET LINESIZE 140; SET TRIMSPOOL ON; set echo on; select g.*, sum("count(*)") over ( order by "count(*)" desc,username,osuser,status,machine,client_info ) "Cumul" from ( select count(*) "count(*)", username,osuser,status,machine,client_info from gv$session where username='PUT_HERE_THE_SCHEMA_OWNER_ACCOUNT' group by username,osuser,status,machine,client_info ) g order by "count(*)" desc,username,osuser,status,machine,client_info; quit;
Note, that access to the system tables and views is required. This check can run no pod inside the kubernetes cluster and send the results directly to CMS monit system.
from Martin:
Devdb keeps automatic backups of the operations, thus when you initialize/delete the DB a lot (such in tests) the space gets full very quickly.
I added a script https://github.com/rucio/rucio/blob/master/tools/purge_bin.py which purges the entire database including the backups. If you do these re-installs a lot I can recommend using this script :-)
In ericvaandering's fork of rucio, there is a branch cms_schema for keeping the schema for CMS (which differs a bit from ATLAS) as well as keeping the CMS needed jobs and procedures.
On top of the Oracles database schema, Rucio defines triggers and stored procedures for automated execution of certain tasks. The code was originally written by Gancho Dimitrov for Atlas. See his talk at 2nd Rucio Community WS.
In CMS Rucio instance we define a subset of generic triggers, some of which are temporarily disabled to allow the reuse of the dids during PhEDEx-to-Rucio catalog synchronization.
List the triggers and their status in cms_rucio_dev_admin.int2r:
SQL> select trigger_name, status from user_triggers; TRIGGER_NAME STATUS ------------------------------ -------- ACCOUNT_AVOID_UPDATE_DELETE ENABLED CHECK_DID_UNIQUENESS DISABLED MIGRATE_DELETED_DID DISABLED SCOPE_AVOID_UPDATE_DELETE ENABLED
To change trigger status use sql commands:
ALTER TRIGGER <trigger_name> DISABLE; ALTER TRIGGER <trigger_name> ENABLE;
SQL> set pages 0 SQL> set linesize 2200 SQL> select OBJECT_NAME, OBJECT_TYPE from user_procedures where OBJECT_TYPE='PROCEDURE'; COLLECTION_REPLICAS_UPDATES SQL> select text from user_source where name='COLLECTION_REPLICAS_UPDATES' order by line; PROCEDURE "COLLECTION_REPLICAS_UPDATES" AS type array_raw is table of RAW(16) index by binary_integer; type array_scope is table of VARCHAR2(30) index by binary_integer; type array_name is table of VARCHAR2(255) index by binary_integer; ids array_raw; rse_ids array_raw; scopes array_scope; names array_name; ds_length NUMBER(19); ds_bytes NUMBER(19); available_replicas NUMBER(19); old_available_replicas NUMBER(19); ds_available_bytes NUMBER(19); ds_replica_state VARCHAR2(1); row_exists NUMBER; CURSOR get_upd_col_rep IS SELECT id, scope, name, rse_id FROM CMS_RUCIO_PROD.updated_col_rep; BEGIN -- Delete duplicates DELETE FROM CMS_RUCIO_PROD.UPDATED_COL_REP A WHERE A.rowid > ANY (SELECT B.rowid FROM CMS_RUCIO_PROD.UPDATED_COL_REP B WHERE A.scope = B.scope AND A.name=B.name AND A.did_type=B.did_type AND (A.rse_id=B.rse_id OR (A.rse_id IS NULL and B.rse_id IS NULL))); -- Delete Update requests which do not have Collection_replicas DELETE FROM CMS_RUCIO_PROD.UPDATED_COL_REP A WHERE A.rse_id IS NOT NULL AND NOT EXISTS(SELECT * FROM CMS_RUCIO_PROD.COLLECTION_REPLICAS B WHERE B.scope = A.scope AND B.name = A.name AND B.rse_id = A.rse_id); DELETE FROM CMS_RUCIO_PROD.UPDATED_COL_REP A WHERE A.rse_id IS NULL AND NOT EXISTS(SELECT * FROM CMS_RUCIO_PROD.COLLECTION_REPLICAS B WHERE B.scope = A.scope AND B.name = A.name); COMMIT; OPEN get_upd_col_rep; LOOP FETCH get_upd_col_rep BULK COLLECT INTO ids, scopes, names, rse_ids LIMIT 5000; FOR i IN 1 .. rse_ids.count LOOP DELETE FROM CMS_RUCIO_PROD.updated_col_rep WHERE id = ids(i); IF rse_ids(i) IS NOT NULL THEN -- Check one specific DATASET_REPLICA BEGIN SELECT length, bytes, available_replicas_cnt INTO ds_length, ds_bytes, old_available_replicas FROM CMS_RUCIO_PROD.collection_replicas WHERE scope=scopes(i) and name=names(i) and rse_id=rse_ids(i); EXCEPTION WHEN NO_DATA_FOUND THEN CONTINUE; END; SELECT count(*), sum(r.bytes) INTO available_replicas, ds_available_bytes FROM CMS_RUCIO_PROD.replicas r, CMS_RUCIO_PROD.contents c WHERE r.scope = c.child_scope and r.name = c.child_name and c.scope = scopes(i) and c.name = names(i) and r.state='A' and r.rse_id=rse_ids(i); IF available_replicas >= ds_length THEN ds_replica_state := 'A'; ELSE ds_replica_state := 'U'; END IF; IF old_available_replicas > 0 AND available_replicas = 0 THEN DELETE FROM CMS_RUCIO_PROD.COLLECTION_REPLICAS WHERE scope = scopes(i) and name = names(i) and rse_id = rse_ids(i); ELSE UPDATE CMS_RUCIO_PROD.COLLECTION_REPLICAS SET state=ds_replica_state, available_replicas_cnt=available_replicas, length=ds_length, bytes=ds_bytes, available_bytes=ds_available_bytes, updated_at=sys_extract_utc(systimestamp) WHERE scope = scopes(i) and name = names(i) and rse_id = rse_ids(i); END IF; ELSE -- Check all DATASET_REPLICAS of this DS SELECT count(*), SUM(bytes) INTO ds_length, ds_bytes FROM CMS_RUCIO_PROD.contents WHERE scope=scopes(i) and name=names(i); UPDATE CMS_RUCIO_PROD.COLLECTION_REPLICAS SET length=nvl(ds_length,0), bytes=nvl(ds_bytes,0) WHERE scope = scopes(i) and name = names(i); FOR rse IN (SELECT rse_id, count(*) as available_replicas, sum(r.bytes) as ds_available_bytes FROM CMS_RUCIO_PROD.replicas r, CMS_RUCIO_PROD.contents c WHERE r.scope = c.child_scope and r.name = c.child_name and c.scope = scopes(i) and c.name = names(i) and r.state='A' GROUP BY rse_id) LOOP IF rse.available_replicas >= ds_length THEN ds_replica_state := 'A'; ELSE ds_replica_state := 'U'; END IF; UPDATE CMS_RUCIO_PROD.COLLECTION_REPLICAS SET state=ds_replica_state, available_replicas_cnt=rse.available_replicas, available_bytes=rse.ds_available_bytes, updated_at=sys_extract_utc(systimestamp) WHERE scope = scopes(i) and name = names(i) and rse_id = rse.rse_id; END LOOP; END IF; COMMIT; END LOOP; EXIT WHEN get_upd_col_rep%NOTFOUND; END LOOP; CLOSE get_upd_col_rep; COMMIT; END; 78 rows selected. SQL>
SQL> set time on 23:05:31 SQL> set timing on 23:05:36 SQL> exec COLLECTION_REPLICAS_UPDATES; PL/SQL procedure successfully completed. Elapsed: 00:20:30.03 23:26:24 SQL>
Example below drops and creates a scheduler job for a stored procedure.
SQL> exec dbms_scheduler.drop_job('COLLECTION_REPLICAS_UPDATES_JB'); BEGIN dbms_scheduler.create_job ( 'COLLECTION_REPLICAS_UPDATES_JB', job_type=>'STORED_PROCEDURE', job_action=> 'COLLECTION_REPLICAS_UPDATES', number_of_arguments=>0, start_date=>TO_TIMESTAMP_TZ('14-MAY-2019 23.00.00 EUROPE/ZURICH','DD-MON-YYYY HH24:MI:SS TZR'), repeat_interval=> 'FREQ=Minutely; INTERVAL=2', job_class=>'RUCIO_JOB_CLASS', enabled=> TRUE, auto_drop=> FALSE, comments=>'Every two minutes remove the duplicates from the UPDATED_COL_REP table for all scopes and update the COLLECTION_REPLICAS data' ); END; /BEGIN dbms_scheduler.drop_job('COLLECTION_REPLICAS_UPDATES_JB'); END; * ERROR at line 1: ORA-27475: "CMS_RUCIO_DEV_ADMIN.COLLECTION_REPLICAS_UPDATES_JB" must be a job ORA-06512: at "SYS.DBMS_ISCHED", line 224 ORA-06512: at "SYS.DBMS_SCHEDULER", line 657 ORA-06512: at line 1 SQL> SQL> 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 PL/SQL procedure successfully completed. SQL>
We get errors when trying to drop a non-existing job.
USER_SCHEDULER_JOBS and USER_SCHEDULER_JOB_RUN_DETAILS are useful to list jobs and check their details, e.g.:
23:03:39 SQL> select JOB_NAME, JOB_ACTION, LAST_RUN_DURATION from user_scheduler_jobs; JOB_NAME ------------------------------ JOB_ACTION -------------------------------------------------------------------------------- LAST_RUN_DURATION --------------------------------------------------------------------------- COLLECTION_REPLICAS_UPDATES_JB COLLECTION_REPLICAS_UPDATES 23:03:59 SQL>
These are created with rucio/etc/sql/oracle/jobs.sql
SQL> select JOB_NAME, LAST_RUN_DURATION,LAST_START_DATE,NEXT_RUN_DATE, STATE from user_scheduler_jobs; JOB_NAME LAST_RUN_DURATION LAST_START_DATE NEXT_RUN_DATE STATE ------------------------------ ------------------- --------------------------------------------- --------------------------------------------- --------------- COLLECTION_REPLICAS_UPDATES_JB +00 00:00:00.094285 07-NOV-19 09.00.00.200351000 PM EUROPE/ZURICH 07-NOV-19 09.30.00.000000000 PM EUROPE/ZURICH SCHEDULED COLL_REPL_UPDATED_JOB_CMS 07-NOV-19 09.02.00.000000000 PM EUROPE/ZURICH SCHEDULED RUCIO_ACCOUNT_USAGE_HIST_JOB 08-NOV-19 08.00.00.700000000 AM EUROPE/ZURICH SCHEDULED RUCIO_DATA_SLIDING_WINDOWS 11-NOV-19 10.00.00.200000000 AM EUROPE/ZURICH SCHEDULED RULES_HIST_SL_WINDOW 11-NOV-19 07.00.00.700000000 AM EUROPE/ZURICH SCHEDULED UPDATE_RSE_USAGE_HISTORY +00 00:00:05.627834 07-NOV-19 09.00.00.225689000 PM EUROPE/ZURICH 07-NOV-19 09.30.00.000000000 PM EUROPE/ZURICH SCHEDULED
- Oracle 11g documentation on triggers, procedures, and jobs:
- https://docs.oracle.com/cd/B28359_01/server.111/b28318/triggers.htm#CNCPT017
- https://docs.oracle.com/cd/B28359_01/appdev.111/b28843/tdddg_procedures.htm
- https://docs.oracle.com/cd/B28359_01/appdev.111/b28419/d_sched.htm#i1000363 and more on permissions:
- https://www.opencodez.com/oracle/oracle-job-scheduler-guide-examples-part-1.htm