If You Post It, They Will SPAM It

Most people don't care for spam in any form.
I hope that my Internet-savvy readers know that you should never, ever, post a personal email address online in public view. In personal emails, in password-protected forums, sure, post away, but otherwise, posting an email address in plain text is a one-way ticket to SPAM-ville.
So, if you already know this stuff, why am I writing about it? Because, obviously, not everyone does. Over the past year, I’ve been responsible for the design and upkeep of a local church web site. Of course, one of the best (nerdy) perks is being able to analyze all the unique stats that roll in. One very helpful metric, the “search engine terms” metric, as it’s commonly called, shows you the terms or phrases a visitor typed into a search engine before landing on your site.
An interesting trend began to appear after a while, one that I hadn’t seen before. It seems that someone, or something, had come to the site after searching for something such as “church in california @hotmail.com.” At first, I only saw a couple of these, but after a while, these hits began occurring weekly with different phrases: “pastors in california @hotmail.com,” “email contacts of pearsons @hotmail.com,” “prayer 2009 @gmail.co.th,” and so on. After digging into the stats more, I was able to pull the IP address of the machines that had landed on the site after those searches. The IP address? 74.125.77.132. I’ll wait a second for the nerds to run a trace.
Weird, huh? That address points squarely at Google. Not all the searches had that address attached to them. For example, one search traced back to Togo Telecom, an ISP in Togo.
No doubt some of you already know what this is all about, but just in case, I’ll lay out the details of my theory. The hits are coming from bots that are programmed to harvest email addresses for specific campaigns. Yes, even church pastors and staff get spammed by “religious” organizations with special “services” to sell. The method of querying a couple of keywords plus a popular email provider is actually pretty smart in a let-someone-else-do-the-heavy-lifting kind of way. The hits from Google are most likely a result of the bot choosing to visit the cached link (a snapshot of the web page as it was indexed) that Google provides for each search result, so the coveted email address it seeks will still be available on the page, just waiting to be added to a list of email addresses for sale. A search engine basically hands the bot a list of pages with email addresses on them, which is much faster than randomly bouncing from site to site hoping to find them.
For example, if I wanted to spam people who are involved with Relay for Life, I would search for “relay for life @yahoo.com,” or if I had a fraudulent operation running on fake Scantron forms, I could search “school teacher @hotmail.com.”
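To show just how little work the bot has to do once a search engine hands it a page, here is a minimal sketch in Python. This is my guess at the general idea, not anyone's actual harvester: the pattern is deliberately simple, and the function name and sample page text are made up for illustration.

```python
import re

# Deliberately simple pattern (an assumption for illustration);
# real addresses can be more exotic than this.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_addresses(page_text: str) -> list[str]:
    """Return the unique email-like strings in page_text, in order of appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(page_text):
        address = match.lower()
        if address not in seen:
            seen.append(address)
    return seen


# Hypothetical page content, invented for this example.
sample_page = (
    "Contact Pastor Bob at pastorbob@hotmail.com or the office at "
    "Office@example.org. Prayer requests: pastorbob@hotmail.com."
)
print(harvest_addresses(sample_page))
```

A dozen lines and one regular expression is all it takes, which is exactly why a plain-text address doesn't stand a chance.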
So, in review: Don’t post any email address online in plain text, unless, of course, you enjoy the extra reading material. Currently, the safest way to allow web site visitors to contact you is to use a temporary “throw-away” address, or a form with CAPTCHA verification. Another method I consider safe enough is generating an image that shows the email address without any actual text on the page (don’t use a mailto link either!). Any of these email image generators will do. Though your email address appears on the page, it isn’t easily read by a bot harvesting email addresses from text. The technology is there, but as far as I know, very few spammers bother with OCR (optical character recognition) since there are still so many good addresses readily available in plain text.
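If you want to check your own pages, something like the following rough sketch would do. It uses Python's standard `html.parser` module to flag any address that sits in visible text or in a tag attribute (a mailto link, or an image's alt text). The class name, function name, and sample markup are all hypothetical, and the email pattern is a simplification.

```python
import re
from html.parser import HTMLParser

# Simplified pattern, assumed for illustration only.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ExposedAddressFinder(HTMLParser):
    """Records every address a text-reading bot could pick up from the markup."""

    def __init__(self):
        super().__init__()
        self.findings = []  # (where, address) pairs, in document order

    def handle_starttag(self, tag, attrs):
        # Attribute values count too: a mailto link or alt text exposes the address.
        for name, value in attrs:
            for address in EMAIL_PATTERN.findall(value or ""):
                self.findings.append((f"<{tag}> {name} attribute", address))

    def handle_data(self, data):
        for address in EMAIL_PATTERN.findall(data):
            self.findings.append(("visible text", address))


def find_exposed(html: str) -> list[tuple[str, str]]:
    finder = ExposedAddressFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


# Hypothetical markup: a plain-text address, a mailto link, and an image.
risky_page = (
    '<p>Write to me@example.com</p>'
    '<a href="mailto:me@example.com">Contact</a>'
)
safer_page = '<img src="contact.png" alt="Contact address">'

print(find_exposed(risky_page))
print(find_exposed(safer_page))
```

The image-only page comes back empty because the address lives in the pixels, not in the markup. Put the address in the image's alt text, though, and it is exposed all over again.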
I wonder what would happen if I Googled “looking for unheard of foreign entity to transfer large sums of cash with no assurance of legitimacy @citibank.com.”
T-Mobile Ge 1
Today’s lesson is a look into phonetic spelling, and the ramifications of including it in your posts! You may notice that the title of this post, “T-Mobile Ge 1,” is meant to refer to the HTC T-Mobile G1 (Android) phone, but the “G” is spelled phonetically. I decided to do this after a discovery I made on my G1 once I received the RC-33 update, which added the ability to perform a Google web search using voice-to-text. When I attempted to search “T-Mobile G1,” it came out as “t mobile g 1,” which gave me pretty much the same results as if I had searched “T-Mobile G1.” This led me to perform searches based on variations of “T-Mobile G1.” While I searched, Google provided suggestions, one of them being “T-Mobile Ge 1.” Google suggests queries based on similar terms with relative popularity. I figured if “T-Mobile Ge 1” appeared in the suggestions, then enough people must be searching for it. Plus, once I executed the search, nothing immediately obvious appeared, though a few blog postings were returned based on the occurrence of “T-Mobile,” and the official site was listed second. So, this post will serve as an experiment to see if people will come here after searching for “T-Mobile Ge 1.” Though, I suspect once the search results are returned, they will immediately realize it was spelled incorrectly. But… I shall see!
I Say Old Bean! [More Search Term Awesome-ness]
The search term “clever ruse” has become a popular avenue to 365 of late. When investigating, I was surprised to learn that my blog had become the number 1 result when searching for “clever ruse” in Google Images. The culprit is this post about the new Google favicon.
I love this stuff! Now, I just gotta write more meaningful content, or the search term fun will end.
Maybe more posts about that “certain part of the male body.” (SFW)

Is this fact, or a clever ruse?
Penis on the Brain
I’m still amazed that the top search term for this site, every.single.day. is “penis!” It’s hilarious!
Though, I’d still like to know which search engine puts my blog up high enough that it would even be in the running for relevant penis web sites.
The Odd Things People Search For
I noticed today that the word “penis” or a phrase containing the word “penis” brings the most people to this blog, and specifically to the post “The Penis Threat Level [The Daily Show].” While CHMODing with Dreamweaver is still popular, it’s dropped to 2nd place… behind penis.
Just for kicks, here are the top 10 search terms for the last 30 days in descending order for this blog… the word penis brings in almost double the hits of 2nd place:
| Rank | Search term |
| --- | --- |
| 1 | penis |
| 2 | meaning of serendipity |
| 3 | clever ruse |
| 4 | chmod dreamweaver |
| 5 | do it live remix |
| 6 | chmod in dreamweaver |
| 7 | dreamweaver chmod |
| 8 | dreamweaver file permissions |
| 9 | finalizing dvd |
| 10 | att dsl limits |
SEO Page and File Names
Just a quick note because I thought this was a really good piece of information to know.
As far as search engine optimization (SEO) goes, a file or page name like this_is_a_file_name.htm will appear as one word to search engines, like thisisafilename. So it’s unlikely it’ll be picked up in searches.
However, if you use hyphens instead of underscores, like this-is-a-better-file-name.htm, search engines will read it as this is a better file name, thereby greatly increasing readability and searchability (yeah, it’s not a word, but I like making words here at 365D).
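If you have a pile of pages to name (or rename), a small helper saves some typing. This is only a sketch in Python: both function names are made up, and the default .htm extension is just an assumption that matches the examples above.

```python
import re


def seo_file_name(title: str, extension: str = "htm") -> str:
    """Build a lowercase, hyphen-separated file name from a page title."""
    # Keep runs of letters and digits; everything else becomes a word break.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words) + "." + extension


def hyphenate(file_name: str) -> str:
    """Swap underscores for hyphens in an existing file name."""
    return file_name.replace("_", "-")


print(seo_file_name("This Is a Better File Name"))
print(hyphenate("this_is_a_file_name.htm"))
```

Keep in mind that renaming a page that is already live changes its URL, so existing links to the old name would need a redirect.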
I have a lot of good stuff coming and lots of stuff in drafts. Hopefully, I’ll be able to get back to daily posting this next week.


