背景:

元数据需要定期维护,否则会导致查询元数据信息缓慢。

清理脚本:

#!/bin/bash

DBNAME=youdata
source /home/gpadmin/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/master_data2/gpseg-1
export PGBOUNCER_PORT=6432

# 开始时间
t=$(date "+%Y-%m-%d %H:%M:%S")
echo "$t: Starting to invoke database maintenance"

# main job
{
    psql -p $PGBOUNCER_PORT -U pgbouncer -d pgbouncer -c 'pause $DBNAME'
    echo "pause db connection done"
    SYSTABLES="' pg_catalog.' || relname || ';' from pg_class a, pg_namespace b \
    where a.relnamespace=b.oid and b.nspname='pg_catalog' and a.relkind='r'"
    psql -tc "SELECT 'VACUUM FULL' || $SYSTABLES" $DBNAME | psql -a $DBNAME
    analyzedb -s pg_catalog -d $DBNAME -a
    endT=$(date "+%Y-%m-%d %H:%M:%S")
    echo "$endT: Maintaining database Done"
} &

# 获取后台进程ID
main_pid=$!
{
   // check every 10 seconds
    while kill -0 $main_pid 2> /dev/null; do
        echo "$(date "+%Y-%m-%d %H:%M:%S"): Checking main script status..."
        sleep 10  
    done

    # 
    endT=$(date "+%Y-%m-%d %H:%M:%S")
    echo "$endT: Main script completed. Performing cleanup."
     psql -p $PGBOUNCER_PORT -U pgbouncer -d pgbouncer -c 'resume $DBNAME'
    echo "Cleanup done."
} &

# 等待主脚本完成
wait $main_pid

将上面的内容复制到catalog_vacuum.sh脚本中,按照部署情况修改脚本前面的几个关键变量值后,保存增加执行权限

使用crontab -e定期调度该脚本,推荐调度频率是一天一次,调度时间为业务低峰期

手动执行的话,输入如下命令:

nohup ./vacuum_full.sh >vacuum_full.log 2>&1 &