Adapt to reddit changes, trade grep for jq, parallelism

Gabriel Ostrolucký 2019-02-17 17:55:07 +01:00
parent 41ff9661d0
commit 56120bbe78
GPG Key ID: 5414CFD98BE2F107
2 changed files with 17 additions and 18 deletions

View File

@@ -3,10 +3,9 @@ Simple Subreddit Image Downloader
 Tired of all of those reddit downloaders which want you to install tons of dependencies and then don't work anyway? Me too.
 
 *Simple Subreddit Image Downloader* is bash script which:
-- has minimal external dependencies
-- downloads full-size images from subreddits
-- is crossplatform (tested on windows with cygwin)
-- uses SSL connection
+- downloads ALL images from specified subreddit in full size
+- Linux/MacOS/Windows
+- Parallel download
 
 This script just downloads all directly linked images in subreddit. For more complex usage, use other reddit image downloader.
@@ -14,10 +13,10 @@ Requirements
 ============
 - bash (cygwin is OK)
 - wget
-- GNU grep (on MacOS install with `brew install grep --with-default-names`)
+- jq
 
 Usage
 =====
-`./rdit.sh <subreddit_name>`
+`./download-subreddit-images.sh <subreddit_name>`
 
 Script downloads images to folder named "down" in current directory. If you want to change that, you need to edit destination in rdit.sh for now.
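
For reference, invoking the renamed script after this change looks like the following; the subreddit name is only an illustrative placeholder:

```bash
# "earthporn" is a hypothetical example; pass any subreddit name.
./download-subreddit-images.sh earthporn
```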

View File

@@ -8,21 +8,21 @@ url="https://www.reddit.com/r/$subreddit/.json?raw_json=1"
 content=`wget -U "$useragent" -q -O - $url`
 mkdir -p $subreddit
 while : ; do
-    urls=$(echo -e "$content"|grep -Po '"source": {"url":.*?[^\\]",'|cut -f 6 -d '"')
-    names=$(echo -e "$content"|grep -Po '"title":.*?[^\\]",'|cut -f 4 -d '"')
-    ids=$(echo -e "$content"|grep -Po '"id":.*?[^\\]",'|cut -f 4 -d '"')
+    urls=$(echo -e "$content"| jq -r '.data.children[]|select(.data.post_hint|test("image")) | .data.preview.images[0].source.url')
+    names=$(echo -e "$content"| jq -r '.data.children[]|select(.data.post_hint|test("image")) | .data.title')
+    ids=$(echo -e "$content"| jq -r '.data.children[]|select(.data.post_hint|test("image")) | .data.id')
     a=1
-    for url in $(echo -e "$urls"); do
-        if [ -n "`echo "$url"|egrep \".gif|.jpg\"`" ]; then
+    wait # prevent spawning too many processes
+    for url in $urls; do
         name=`echo -e "$names"|sed -n "$a"p`
         id=`echo -e "$ids"|sed -n "$a"p`
-        echo $name
-        newname="$name"_"$subreddit"_$id.${url##*.}
-        wget -U "$useragent" --no-check-certificate -nv -nc -P down -O "$subreddit/$newname" $url
-        fi
+        ext=`echo -e "${url##*.}"|cut -d '?' -f 1`
+        newname="$name"_"$subreddit"_$id.$ext
+        echo $name
+        wget -U "$useragent" --no-check-certificate -nv -nc -P down -O "$subreddit/$newname" $url &>/dev/null &
         a=$(($a+1))
     done
-    after=$(echo -e "$content"|grep -Po '"after":.*?[^\\]",'|cut -f 4 -d '"'|tail -n 1)
+    after=$(echo -e "$content"| jq -r '.data.after')
     if [ -z $after ]; then
         break
     fi
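
As a reading aid, here is a minimal, self-contained sketch of what the new jq filters extract. The JSON below is a hand-made, trimmed-down imitation of reddit's listing shape (field values are illustrative, not real API output), and `sample.json` is just a scratch file name:

```bash
# Hypothetical, trimmed-down stand-in for reddit's /r/<sub>/.json listing,
# with only the fields the jq filters in this commit actually touch.
cat <<'EOF' > sample.json
{
  "data": {
    "after": "t3_abc123",
    "children": [
      {
        "data": {
          "id": "abc123",
          "title": "Example post",
          "post_hint": "image",
          "preview": {
            "images": [
              { "source": { "url": "https://i.redd.it/example.jpg" } }
            ]
          }
        }
      }
    ]
  }
}
EOF

# Same filters as in the script: keep only posts hinted as images,
# then print the full-size preview URL, the title, and the id.
jq -r '.data.children[]|select(.data.post_hint|test("image")) | .data.preview.images[0].source.url' sample.json
jq -r '.data.children[]|select(.data.post_hint|test("image")) | .data.title' sample.json
jq -r '.data.children[]|select(.data.post_hint|test("image")) | .data.id' sample.json

# Pagination cursor used to fetch the next page ("null" on the last page).
jq -r '.data.after' sample.json
```

The `wait` placed before each page's loop, together with the trailing `&` on the wget call, is the parallelism named in the commit title: each page's downloads run as background jobs, and the script waits for them to finish before moving on to the next page.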