How to filter ads on your G1

bmdixon

Senior Member
Aug 2, 2010
Just thought I'd add my current script to this thread. After discovering several domains that weren't blocked, I got fed up with constantly editing the script file on my phone, so I've now got my own hosts file additions on my Dropbox account. I can add domains to this list and they get added to the hosts file. Credit goes to "XlAfbk" for the original script.
<MY id> and <myfile> would obviously need to be changed (just get the public URL for the file via Dropbox).

Code:
/system/xbin/curl -so /data/data/hoststmp http://hosts-file.net/ad_servers.asp
/system/xbin/curl -s http://winhelp2002.mvps.org/hosts.txt >> /data/data/hoststmp
/system/xbin/curl -s http://dl.dropbox.com/u/<MY id>/<myfile>.txt >> /data/data/hoststmp
cat /data/data/hoststmp | cut -d'#' -f 1 | grep -v gravatar.com | grep -v wordpress.com | awk NF | awk '{sub(/\r$/,"")};1' | sed 's/  \|\t/ /g' | sort | uniq -u > /system/etc/hosts
rm /data/data/hoststmp
 

SubMatrix

Senior Member
Jun 24, 2010
That was it! Thanks so much! My owner was set to "System" and the Owner had no permissions lol. Not sure how that happened.
OK, I spoke too soon. Today I tried Angry Birds again and was getting ads again. I went in to /data/data and deleted everything except the two .lua files in /files and the /lib directory. I rechecked my hosts file and I have the permissions set to rw-rw-rw, with Owner and Group both set to root.

The ads I get are the blue ones again as in the screenshots I posted a few posts back.

Just to make sure, once again the only thing I'm doing to block ads is modifying the hosts file in /system/etc. I'm not doing any of this curl stuff or any linking. Did I forget anything?
 

wraithdu

Senior Member
Aug 28, 2008
EDIT:
Still a slight WIP, working on removing end-of-line comments.

EDIT2:
Ok, got it. First it strips anything that is a comment, so from a # char to the end of the line. Then if anything is left, it converts the line ending and removes extra white space. egrep then removes any html that snuck in and any other lines you want to except (gravatar, wordpress, etc).

I've made some improvements to the script, added an additional source, and fixed a few errors. I got some good info from this site about awk: http://www.pement.org/awk/awk1line.txt

Fixed:
1) Correctly removes leading/trailing/duplicate white space, allowing for proper stripping and de-duplication.
2) Removes comment lines and embedded html lines
3) Fixed uniq command, which was totally removing any duplicate lines. Now it correctly leaves one copy of the duplicate line.

Add any additional exceptions (like gravatar.com) to the egrep line.
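
For anyone wondering what the uniq fix in item 3 actually changes, here is a small sketch you can run in any shell. The example.com/example.net entries are made-up placeholders, not real ad servers.

```shell
# Sample input with one duplicated entry (placeholder domains).
sample='127.0.0.1 ads.example.com
127.0.0.1 ads.example.com
127.0.0.1 tracker.example.net'

# uniq -u prints only lines that are NOT repeated, so the duplicated
# entry disappears from the output completely.
only_unrepeated=$(printf '%s\n' "$sample" | sort | uniq -u)

# Plain uniq collapses each run of identical lines to a single copy.
deduplicated=$(printf '%s\n' "$sample" | sort | uniq)

printf 'uniq -u:\n%s\n\nuniq:\n%s\n' "$only_unrepeated" "$deduplicated"
```

With `uniq -u`, any domain that shows up in more than one source list ends up unblocked, which is the opposite of what you want here.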

Code:
#!/system/bin/sh

/system/xbin/curl -so /data/hoststmp.tmp -G -d hostformat=hosts -d showintro=0 -d mimetype=plaintext http://pgl.yoyo.org/adservers/serverlist.php
/system/xbin/curl -s http://hosts-file.net/ad_servers.asp >> /data/hoststmp.tmp
/system/xbin/curl -s http://winhelp2002.mvps.org/hosts.txt >> /data/hoststmp.tmp
/system/xbin/curl -s http://dl.dropbox.com/s/<ID>/<file>.txt >> /data/hoststmp.tmp
/system/xbin/busybox mount -o rw,remount /system
cat /data/hoststmp.tmp | awk '(NF){sub(/\#.*/,"")};(NF){sub(/\r$/,"");$1=$1;print}' | egrep -v '[\<\>]|gravatar\.com|wordpress\.com' | sort | uniq > /system/etc/hosts
chown 0:0 /system/etc/hosts
chmod 644 /system/etc/hosts
/system/xbin/busybox mount -o ro,remount /system
ls -l /system/etc/hosts
rm /data/hoststmp.tmp
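
If you want to try the filter chain on a desktop before pushing it to the phone, a rough sketch like this should do. `clean_hosts` is just a name I made up for the example, and the sample lines are placeholders. I've written `grep -E` instead of egrep and dropped the backslashes before `#`, `<` and `>`, which I'd expect to behave the same but haven't checked against every awk/grep build.

```shell
# clean_hosts (hypothetical name): reads a raw hosts list on stdin and
# writes the cleaned, de-duplicated list to stdout.
clean_hosts() {
    # 1st rule: strip everything from '#' to end of line.
    # 2nd rule: if anything is left, drop a trailing CR, squeeze the
    #           white space via $1=$1, and print the line.
    awk '(NF){sub(/#.*/,"")};(NF){sub(/\r$/,"");$1=$1;print}' |
        grep -E -v '[<>]|gravatar\.com|wordpress\.com' |
        sort | uniq
}

# Sample input: a comment line, a trailing comment, CRLF endings,
# stray html, an excepted domain, a blank line and a tab separator.
printf '# comment line\n127.0.0.1   ads.example.com   # trailing comment\r\n127.0.0.1 ads.example.com\r\n<pre>\n127.0.0.1 en.gravatar.com\n\n127.0.0.1\ttracker.example.net\n' |
    clean_hosts
```

Only the ads.example.com and tracker.example.net entries should survive, one copy each.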
 

XlAfbk

Senior Member
Aug 11, 2010
/system/xbin/curl -so /data/hoststmp.tmp -G -d hostformat=hosts -d showintro=0 http://pgl.yoyo.org/adservers/serverlist.php
that still leaves you with some html, as the site doesn't send the pure list but an html file with the actual list in <pre>

2) Fixed uniq command, which was totally removing any duplicate lines
ooops

1) Correctly removes leading/trailing/duplicate white space, allowing for proper de-duplication.
awk '(NF){sub(/\r$/,"");$1=$1;print}'
that's just combining the 2 awk commands into 1, isn't it?

Still a slight WIP, working on removed end-of-line comments.
cut -d'#' -f 1 should have taken care of all of them; at least I didn't find any remaining ones in my tests.
 

wraithdu

Senior Member
Aug 28, 2008
Ha, cross-posted.

- I updated the first URL to only grab the plaintext page, mimetype=plaintext
- There's an additional awk command, $1=$1, which removes leading/trailing/duplicate white space.
- I added an extra awk command to delete the comments, both leading comment lines, and any line trailing comments (breaks de-duplication if you don't).
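
To show why the white space matters for de-duplication, here is a minimal sketch with placeholder entries. It counts the lines left after sort | uniq, with and without the $1=$1 step.

```shell
# Two spellings of the same entry that differ only in white space
# (example.com is a placeholder).
raw=$(printf '127.0.0.1  ads.example.com\n  127.0.0.1\tads.example.com  \n')

# Without normalization, sort | uniq sees two different lines.
before=$(printf '%s\n' "$raw" | sort | uniq | wc -l | tr -d ' ')

# $1=$1 forces awk to rebuild the line from its fields, joined by
# single spaces, so both spellings collapse into one line.
after=$(printf '%s\n' "$raw" | awk '{ $1 = $1; print }' | sort | uniq | wc -l | tr -d ' ')

echo "before: $before, after: $after"
```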

Credit where credit is due - I couldn't have modified the script without somewhere to start :)
 

XlAfbk

Senior Member
Aug 11, 2010
you can even drop the egrep completely:
Code:
awk '(NF){sub(/\#.*/,"")};(NF){sub(/.*<.*/,"")};(NF){sub(/\r$/,"");$1=$1;print}'
with wordpress:
Code:
awk '(NF){sub(/\#.*/,"")};(NF){sub(/.*<.*/,"")};(NF){sub(/.*wordpress.*/,"")};(NF){sub(/\r$/,"");$1=$1;print}'
edit: even better, just kill anything that contains chars other than a-z, 0-9, ., space
edit2: ignore that, code removed
 

wraithdu

Senior Member
Aug 28, 2008
Nice. I'm curious about performance of an all-awk command versus leaving the one egrep line in there. Based on file size I'm sure it's negligible. Good to have options though.
 

XlAfbk

Senior Member
Aug 11, 2010
here's my updated version:
Code:
#!/system/xbin/bash
/system/xbin/curl -so /data/data/hoststmp -G -d hostformat=hosts -d showintro=0 -d mimetype=plaintext http://pgl.yoyo.org/adservers/serverlist.php
/system/xbin/curl -s http://hosts-file.net/ad_servers.asp >> /data/data/hoststmp
/system/xbin/curl -s http://winhelp2002.mvps.org/hosts.txt >> /data/data/hoststmp
/system/xbin/curl -s http://content.wuala.com/contents/XXX/webaccess/hosts.txt?key=XXX\&dl=1 >> /data/data/hoststmp
cat /data/data/hoststmp | awk '(NF){sub(/\#.*/,"")};(NF){sub(/\r$/,"");$1=$1;print}' | sort | uniq > /data/data/hosts
rm /data/data/hoststmp
Added some of wraithdu's changes. I'm not removing stuff like wordpress because I don't need it, and I'm using Wuala instead of Dropbox.

edit: removed the command for removing "<" as there shouldn't be any html comments left in the file
 

XlAfbk

Senior Member
Aug 11, 2010
Maybe need to escape the < char, i.e. \< ? Not sure if awk or the shell treats that specially in this context.
nah, it was an inconsistency with my local awk version. But I just completely removed that awk command as there's no html left in the file.
 

SubMatrix

Senior Member
Jun 24, 2010
XlAfbk, are you using Tasker to automate the launching of this script?

Otherwise what are the steps for actually executing the script once I've copied the script code into a file and placed it somewhere on my phone?
 

wraithdu

Senior Member
Aug 28, 2008
I use Script Manager, I like it better than GScript. Push the script file to /sdcard/scripts and set the folder as your home location in the app. Make sure you are using Unix line endings in the script file (LF, and not CRLF) or it will not execute properly.
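
If the script was edited on Windows, something along these lines should fix the line endings before you run it. This is only a sketch: `to_unix_endings` is a helper name I made up, and the demo uses a throwaway temp file rather than a real path on the phone.

```shell
# to_unix_endings FILE (hypothetical helper): strips carriage returns
# so a script saved with CRLF line endings becomes LF-only.
to_unix_endings() {
    tr -d '\r' < "$1" > "$1.unix" && mv "$1.unix" "$1"
}

# Demo on a throwaway file written with CRLF endings.
demo=$(mktemp)
printf 'echo hello\r\necho world\r\n' > "$demo"
to_unix_endings "$demo"

# Running it through sh explicitly avoids needing the execute bit.
sh "$demo"
rm -f "$demo"
```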
 

AAccount

Senior Member
Sep 8, 2010
I have the same issue as magillos. I checked my webcacheview.db and I see entries from urls like this:

rovio-news-assets.angrybirdsgame.com
rovio-news-app.angrybirdsgame.com
mmv.admob.com

These are clearly in my hosts file:

127.0.0.1 mmv.admob.com
127.0.0.1 appadserver.com
127.0.0.1 caggsm-img.appads.com
127.0.0.1 rovio.appads.com
127.0.0.1 an.appads.com
127.0.0.1 rovio-news-app.angrybirdsgame.com
127.0.0.1 rovio-news-assets.angrybirdsgame.com

Let me outline my process just to make sure I'm not doing anything wrong. I modify the hosts file by adding lines such as above, and then copy it to /system/etc. I then try angry birds. Sometimes I reboot but it doesn't seem to make a difference.
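
One quick sanity check for a process like this is to confirm the entries really are in the file you copied over. A rough sketch follows, with `is_blocked` being a made-up helper name and the demo file standing in for /system/etc/hosts. It only tells you the line is present, not that the app actually honors it.

```shell
# is_blocked HOSTS_FILE DOMAIN (hypothetical helper)
# Succeeds if a non-comment line maps DOMAIN to 127.0.0.1. The match
# is exact, so a stray character on the name (for example a trailing
# carriage return from CRLF endings) makes the check fail.
is_blocked() {
    awk -v d="$2" '
        { sub(/#.*/, "") }
        $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == d) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Demo against a throwaway file.
hosts=$(mktemp)
printf '127.0.0.1 localhost\n127.0.0.1 mmv.admob.com\n# 127.0.0.1 commented.example.com\n' > "$hosts"
for d in mmv.admob.com rovio.appads.com; do
    if is_blocked "$hosts" "$d"; then echo "$d: listed"; else echo "$d: NOT listed"; fi
done
rm -f "$hosts"
```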

Here are some images of the kinds of ads I see now:


Haven't checked on how my Angry Birds blacklists were doing. I noticed that too and figured out there were hosts of the form rovioXXX.appads.com, where XXX is a 1 to 3 digit number. I attached my new one.
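
Since a hosts file can't do wildcards, each numbered host needs its own line. A loop along these lines can generate them. This is a sketch, not the attached list: `gen_rovio_entries` is a made-up name, and it assumes the numbers run from 0 to 999 without leading zeros.

```shell
# gen_rovio_entries (hypothetical helper): prints one hosts line per
# numbered subdomain, rovio0.appads.com through rovio999.appads.com.
# Assumption: numbers are not zero-padded.
gen_rovio_entries() {
    i=0
    while [ "$i" -le 999 ]; do
        echo "127.0.0.1 rovio$i.appads.com"
        i=$((i + 1))
    done
}

# Show the first few generated lines.
gen_rovio_entries | head -n 3
```

The output can be appended to your own additions file so it gets merged in with the other lists.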

And now I need your help. Has anybody figured out how to remove the ads on the new Maps 5.7.0 or the Google mobile website (the pink highlighted ones)? My standard method of "wiretapping" my phone through Wireshark and looking for DNS queries has failed this time.
 


AAccount

Senior Member
Sep 8, 2010
Thanks. It removed ads from AB.
Glad it worked for you. Has anybody figured this out yet: "And now I need your help. Has anybody figured out how to remove the ads on the new Maps 5.7.0 or the Google mobile website (the pink highlighted ones)? My standard method of "wiretapping" my phone through Wireshark and looking for DNS queries has failed this time."

Please don't tell me that the IP address of the ad server is hard-coded or something awful like that.