macOS/Linux script to download all statuses

kevg

Posted 2024-05-12 15:06:39

Should also work on Windows through Cygwin.

1. Make sure you have the jq utility installed.

2. Create status_cafe.sh:

#!/bin/sh
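# usage: status_cafe.sh [-o] [-v] USER
#   -o  only download the first page of statuses
#   -v  verbose: also print the raw page and the extracted lines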
TIMEOUT_SECONDS=5
VERBOSE=0
ONLYFIRST=0
OPTIND=1
while getopts "ov" opt; do
  case "$opt" in
    o)
      ONLYFIRST=1
      ;;
    v)
      VERBOSE=1
      ;;
  esac
done

shift $((OPTIND - 1))
TARGETUSER=$1

if [ "${TARGETUSER}" = "" ]; then
  # Errors go to stderr so they don't end up in the redirected JSON file
  echo "Status Cafe user name is a required argument" >&2
  exit 1
fi

# ">/dev/null 2>&1" is the POSIX spelling; "&>" is a bash extension
if ! command -v jq >/dev/null 2>&1; then
  echo "The jq command must be installed" >&2
  exit 1
fi

createNote() {
  cat << EOF
      {
        "@type": "Create",
        "object": {
          "@type": "Note",
          "content": "${1}"
        },
        "published": "${2}"
      }
EOF
}

processPage() {
  TARGETURL="${1}"
  ANY_ELEMENTS_PRINTED=${2}

  MAINPAGE="$(curl --no-progress-meter --max-time ${TIMEOUT_SECONDS} "${TARGETURL}")"
  RC=$?

  if [ ${RC} -ne 0 ]; then
    echo "Failed to query Status Cafe user page. Check the user name and your internet connection." >&2
    exit 1
  fi
  
  # Note: "\+" and "\n" in the last sed expression are GNU sed extensions
  SUBSET="$(echo "${MAINPAGE}" | \
              grep -e status-username -e status-content | \
              sed 's/.*status-content">//g' | \
              sed 's/<\/p>$//g' | \
              sed 's/.*<\/a> \+\([^ ]\+\) \+\(.*\)<\/div>/\1\n\2/g')"

  if [ "${VERBOSE}" -eq "1" ]; then
    echo "processPage ${TARGETURL}"
    echo "=============="
    echo "${MAINPAGE}"
    echo "=============="
    echo "${MAINPAGE}" | hexdump -C
    echo "=============="
    echo "${SUBSET}"
    echo "=============="
  fi

  echo "${SUBSET}" | \
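    while read -r EMOJI; do
      read -r TIME_AGO
      read -r STATUS

      # date -d with a free-form time string is GNU date syntax
      DATE_ISO8601="$(TZ=UTC date -d "${TIME_AGO}" +%Y-%m-%dT%H:%M:%SZ)"
      ESCAPED_STATUS="$(printf '%s' "${STATUS}" | jq -Rsa . | sed 's/^"//g' | sed 's/"$//g')"
      # JSON doesn't allow a trailing comma after the last array element,
      # so print the separator before every element except the first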
      if [ ${ANY_ELEMENTS_PRINTED} -eq 1 ]; then
        printf ",\n"
      else
        ANY_ELEMENTS_PRINTED=1
      fi
      NOTE="$(createNote "${ESCAPED_STATUS} ${EMOJI}" "${DATE_ISO8601}")"
      printf '%s' "$NOTE"
    done
  printf "\n"

  if [ "${ONLYFIRST}" -eq "0" ]; then
    # Check if there's a next page
    NEXTPAGE="$(echo "${MAINPAGE}" | grep "Older statuses" | sed 's/.*page=//g' | sed 's/".*//g')"
    if [ "${NEXTPAGE}" != "" ]; then
      processPage "https://status.cafe/users/${TARGETUSER}?page=${NEXTPAGE}" 1
    fi
  fi
}

cat << EOF
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "@type": "Person",
  "@id": "${TARGETUSER}",
  "name": "${TARGETUSER}",
  "outbox": {
    "@type": "OrderedCollection",
    "orderedItems": [
EOF

processPage "https://status.cafe/users/${TARGETUSER}" 0

cat << EOF
    ]
  }
}
EOF

3. Make it executable:

chmod a+x status_cafe.sh

4. Execute it and redirect the output to a JSON file, replacing USER in both places with your user name:

./status_cafe.sh USER > USER.json
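
Before relying on the file, it may be worth checking that it actually parses as JSON. A minimal sketch, assuming jq is on the PATH; check_export is a hypothetical helper name and not part of the script above:

```shell
#!/bin/sh
# check_export FILE: succeeds only if jq can parse FILE as JSON.
# (check_export is a hypothetical helper, not part of status_cafe.sh.)
check_export() {
  # "jq empty" reads and parses the input but prints nothing;
  # it exits with a non-zero status on a parse error.
  if jq empty "$1" >/dev/null 2>&1; then
    echo "valid JSON: $1"
  else
    echo "invalid JSON: $1" >&2
    return 1
  fi
}
```

With the function defined in your shell, run check_export USER.json after the download finishes.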

The output is JSON-LD, modeled on what an ActivityPub feed might look like, though I'm not sure what the latest pseudo-standard is. Nothing imports this yet, but at least it's a way to save statuses in a pseudo-standardized format.
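
Since nothing imports the file yet, jq is probably the easiest way to read the statuses back. A rough sketch, assuming the file has the outbox.orderedItems layout the script prints; list_statuses is a hypothetical helper name:

```shell
#!/bin/sh
# list_statuses FILE: print one "published<TAB>content" line per saved status.
# (list_statuses is a hypothetical helper, not part of status_cafe.sh.)
list_statuses() {
  # -r prints raw text instead of JSON strings; @tsv joins the two
  # fields with a tab and escapes tabs/newlines inside the content.
  jq -r '.outbox.orderedItems[] | [.published, .object.content] | @tsv' "$1"
}
```

For example, list_statuses USER.json | head -n 5 shows the first five entries in the file.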

Last edited on 2025-04-22 20:53:29

beesilisk

Posted 2024-08-01 04:35:37

ty <3