Hey there,
I'm currently working on an app that exports data from Google Hangouts using the Takeout JSON file.
Unfortunately, I haven't used Hangouts very heavily myself, so I need some more example datasets from other users.
If you want to help me, head over to https://www.google.com/settings/takeout#custom:chat and get a copy of your chat history.
Then you can use the following script to anonymize your data. Since I can only anonymize data I know about, PLEASE go through the file afterwards and check whether any information leaked before sending it to me.
If you find data that the script does not anonymize, please let me know!
Code:
#!/bin/bash
# Work on a copy so the original export stays untouched.
SRC="Hangouts.json"
FILE="$SRC.pseudo"
cp -- "$SRC" "$FILE" || exit 1

# Print a random 8-digit hex string.
randstr() {
    perl -e 'printf "%08X\n", rand(0xffffffff);'
}

# pseudonymize KEY PREFIX: replace every distinct value of KEY with PREFIX plus
# a random string, everywhere that value occurs as a quoted string in the file.
# Values that already start with PREFIX are skipped. The strings are handed to
# perl through the environment and matched with \Q...\E, so regex
# metacharacters in names or IDs are taken literally.
pseudonymize() {
    local key="$1" prefix="$2" value
    while IFS= read -r value; do
        OLD="$value" NEW="$prefix$(randstr)" \
            perl -pi -e 's/"\Q$ENV{OLD}\E"/"$ENV{NEW}"/g' "$FILE"
    done < <(KEY="$key" PREFIX="$prefix" perl -ne \
        'print "$1\n" if /"\Q$ENV{KEY}\E" : "(?!\Q$ENV{PREFIX}\E)((?:[^"\\]|\\.)+)"/' "$FILE" | sort -u)
}

#pseudonymize all gaia_ids (and chat_ids of the same value)
pseudonymize gaia_id "pseudo:"
#as far as I've seen gaia_id equals chat_id, but if that's not always the case, let's pseudonymize those, too
pseudonymize chat_id "pseudo:"
pseudonymize fallback_name "pseudoname:"

#blank out message text and URLs
perl -pi -e 's/"(text|display_url|link_target|url|image_url)"\s*:\s*"(?:[^"\\]|\\.)*"/"$1" : "ANONYMIZED_DATA"/g' "$FILE"
Thanks everyone!