curl script: wait for the download to finish

2024-06-20

curl

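Below is part of a script I run every day to download files from a website. Recently, however, the site has tightened its limit on download speed. I increased the sleep time, but then everything else takes too long: there are many files to download, and some of them are very small.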

I would like to change the script so that the sleep is removed (or set very low) and the script instead waits for each file download to finish.

Edit:

I found out why the large files do not finish downloading: curl reports "Failure when receiving data from the peer". How can I solve this problem? I have read that switching to wget is the best option, but how would this script work with wget?

#check directories are empty, not empty if there was a problem last time
cd /home/user/upload
if [ "$(ls -A /home/user/upload)" ]; then
#     echo 'Directory not empty error for csv manipulation' | /bin/mailx -s "Server scrapeandcleandomains error" use
     echo "$(date) Directory /home/user/upload not empty for csv manipulation"  >> /home/user/logfile
     exit 1
else
     echo "$(date) starting normal" >> /home/user/logfile
fi

#create yesterday variable
yesterday=$(date --date="$1 - 2 days" +"%Y_%m_%d")
#$(date --date="-2 day" +"%Y_%m_%d")

#download .csv.gz files (old wget command) OBSOLETE!!!!!
#cd /home/user/upload
#wget -R html,"index.*" -A "$yesterday*.csv.gz" -N -r -c -l1 -nd --no-check-certificate --user USERNAME --password PASSWORD -np http://www.websitedownloadfrom.com/sub/
#exit 1

#download index and sanitize > index2.tmp
cd /home/user
curl -u "USERNAME:PASSWORD" -k  http://www.websitedownloadfrom.com/sub/ -o index.html.tmp
links -dump index.html.tmp > /home/user/index.tmp
#this will work until 2049 ONLY!!
sed -i '/20[1-4][0-9]/!d' index.tmp
sed -i '/\[DIR\]/d' index.tmp
for i in {1..50} ; do
   sed -i 's/  / /' index.tmp
done

awk -F" " '{ print $3 }' index.tmp > index2.tmp
sed -i "/^${yesterday}/!d" index2.tmp


#download .csv.gz files according to index2.tmp
while read F  ; do
    cd /home/user/upload
    curl -u "USERNAME:PASSWORD" -k  http://www.websitedownloadfrom.com/sub/$F -o $F &
    sleep 80
done < /home/user/index2.tmp

sleep 60

#check that we downloaded something
cd /home/user/upload
if ! [ "$(ls -A /home/user/upload)" ]; then
    echo 'nothing downloaded from upload'  >> /home/user/logfile
    rm -f /home/user/upload/*
    rm -f /home/user/index.html.tmp
    rm -f /home/user/index.tmp
    rm -f /home/user/index2.tmp
    exit 1
fi

Best answer 1

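Remove the sleep 80 command and the & at the end of the curl command just before it. Without the &, curl runs in the foreground, so the script waits for each download to finish before it continues with the next iteration of the loop.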
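As a minimal sketch of that change, the download loop could be wrapped in a function like the one below. The function name download_all and its three arguments are hypothetical and are not part of the original script. The --fail and -sS options are additions as well, used here only so that an HTTP error shows up as a non-zero exit status that the loop can check.

```shell
#!/bin/bash
# Hypothetical helper: download_all LIST_FILE BASE_URL DEST_DIR
# Downloads every file named in LIST_FILE, one after another.
download_all() {
    local list="$1"
    local base_url="$2"
    local dest="$3"
    local name
    local failed=0

    while IFS= read -r name; do
        # Skip empty lines in the list
        [ -n "$name" ] || continue

        # No trailing '&' and no sleep: curl runs in the foreground,
        # so the loop blocks here until this download has finished.
        if ! curl -u "USERNAME:PASSWORD" -k --fail -sS \
                "$base_url/$name" -o "$dest/$name"; then
            echo "$(date) download failed: $name" >&2
            failed=1
        fi
    done < "$list"

    # 0 if every download succeeded, 1 if at least one failed
    return "$failed"
}
```

In the script from the question, the existing while loop could then be replaced by a call such as download_all /home/user/index2.tmp http://www.websitedownloadfrom.com/sub /home/user/upload, and the sleep 60 after the loop would no longer be needed because nothing is left running in the background.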
