macOS/Linux script to download all statuses

Author Topic
kevg

Posted 2024-05-12 15:06:39

Should also work on Windows through Cygwin.

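1. Make sure you have curl and the jq utility installed. The script also calls date -d and uses \+ and \n in sed, which are GNU extensions, so on macOS it will probably need GNU date and GNU sed (for example from Homebrew's coreutils and gnu-sed packages) available on the PATH as date and sed.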

2. Create status_cafe.sh:

#!/bin/sh
# usage: status_cafe.sh USER
TARGETUSER=$1
TIMEOUT_SECONDS=5

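if [ "${TARGETUSER}" = "" ]; then
  # Errors go to stderr so they don't end up in the redirected JSON file
  echo "Status Cafe user name is a required argument" >&2
  exit 1
fi

# "&>" is a bash extension; plain sh needs "> /dev/null 2>&1"
if ! command -v jq > /dev/null 2>&1; then
  echo "The jq command must be installed" >&2
  exit 1
fi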

createNote() {
  cat << EOF
      {
        "@type": "Create",
        "object": {
          "@type": "Note",
          "content": "${1}"
        },
        "published": "${2}"
      }
EOF
}

processPage() {
  TARGETURL="${1}"
  ANY_ELEMENTS_PRINTED=${2}

  # --fail makes curl exit non-zero on HTTP errors (e.g. a 404 for an unknown user)
  MAINPAGE="$(curl --fail --no-progress-meter --max-time "${TIMEOUT_SECONDS}" "${TARGETURL}")"
  RC=$?

  if [ ${RC} -ne 0 ]; then
    echo "Failed to query Status Cafe user page. Check the user name and your internet connection." >&2
    exit 1
  fi

  echo "${MAINPAGE}" | \
    grep -e status-username -e status-content | \
    sed 's/.*status-content">//g' | \
    sed 's/<\/p>$//g' | \
    sed 's/.*<\/a> \+\(.\) \(.*\)<\/div>/\1\n\2/g' | \
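    while read -r EMOJI; do
      read -r TIME_AGO
      read -r STATUS

      # %M is minutes (%m is the month); with TZ=UTC a literal Z is the ISO 8601 suffix
      DATE_ISO8601="$(TZ=UTC date -d "${TIME_AGO}" +%Y-%m-%dT%H:%M:%SZ)"
      # Let jq do the JSON string escaping, then strip the surrounding quotes
      ESCAPED_STATUS="$(printf '%s' "${STATUS}" | jq -Rsa . | sed 's/^"//; s/"$//')"
      # JSON doesn't allow a trailing comma after the last array element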
      if [ ${ANY_ELEMENTS_PRINTED} -eq 1 ]; then
        printf ",\n"
      else
        ANY_ELEMENTS_PRINTED=1
      fi
      NOTE="$(createNote "${ESCAPED_STATUS} ${EMOJI}" "${DATE_ISO8601}")"
      # printf is more portable than echo -n under /bin/sh
      printf '%s' "$NOTE"
    done
  printf "\n"

  # Check if there's a next page
  NEXTPAGE="$(echo "${MAINPAGE}" | grep "Older statuses" | sed 's/.*page=//g' | sed 's/".*//g')"
  if [ "${NEXTPAGE}" != "" ]; then
    processPage "https://status.cafe/users/${TARGETUSER}?page=${NEXTPAGE}" 1
  fi
}

cat << EOF
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "@type": "Person",
  "@id": "${TARGETUSER}",
  "name": "${TARGETUSER}",
  "outbox": {
    "@type": "OrderedCollection",
    "orderedItems": [
EOF

processPage "https://status.cafe/users/${TARGETUSER}" 0

cat << EOF
    ]
  }
}
EOF

3. Make it executable:

chmod a+x status_cafe.sh

4. Run it and redirect the output to a JSON file, replacing both occurrences of USER with your user name:

./status_cafe.sh USER > USER.json
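
To check that the file from step 4 is well-formed, something like the following should work. This is only a sketch: sample.json and its contents are made up for illustration and just mimic the general shape of the script's output, so point jq at your own USER.json instead.

```shell
#!/bin/sh
# sample.json is a hypothetical stand-in for the USER.json produced in step 4.
cat > sample.json << 'EOF'
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "@type": "Person",
  "@id": "example",
  "name": "example",
  "outbox": {
    "@type": "OrderedCollection",
    "orderedItems": [
      {
        "@type": "Create",
        "object": { "@type": "Note", "content": "hello world" },
        "published": "2024-05-12T15:06:39Z"
      },
      {
        "@type": "Create",
        "object": { "@type": "Note", "content": "second status" },
        "published": "2024-05-11T09:30:00Z"
      }
    ]
  }
}
EOF

# jq exits non-zero on a parse error, so this doubles as a validity check.
COUNT="$(jq '.outbox.orderedItems | length' sample.json)"
echo "statuses: ${COUNT}"
```

If jq prints a parse error instead of a count, the download was probably interrupted or a status contained something the escaping did not handle.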

The output is JSON-LD, modeled on what an ActivityPub feed might look like, though I'm not sure what the latest pseudo-standard is. Nothing imports this yet, but at least it's a way to save statuses in a pseudo-standardized format.
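
Since nothing imports the file yet, jq is probably the easiest way to read it back. Here is a rough sketch that prints one line per status (timestamp, tab, text). The file name statuses.json and the sample entries are invented for the example; only the field names (outbox, orderedItems, published, object.content) come from the script.

```shell
#!/bin/sh
# statuses.json is a hypothetical, hand-written sample in the same shape as the export.
cat > statuses.json << 'EOF'
{"outbox": {"orderedItems": [
  {"@type": "Create", "object": {"@type": "Note", "content": "reading a book"}, "published": "2024-05-12T15:06:39Z"},
  {"@type": "Create", "object": {"@type": "Note", "content": "making tea"}, "published": "2024-05-11T09:30:00Z"}
]}}
EOF

# One line per status: the published timestamp, a tab, then the status text.
STATUS_LINES="$(jq -r '.outbox.orderedItems[] | [.published, .object.content] | @tsv' statuses.json)"
printf '%s\n' "${STATUS_LINES}"
```

The tab-separated output can then be piped into grep, sort, or a spreadsheet import.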

Last edited on 2024-05-12 15:14:55