WordLog

A weblog authored by Carthik about the latest in the WordPress world.

Thursday, May 19, 2005

Help Needed with Google

Filed under: — Carthik @ 12:02 pm

Please think of this as a public appeal for help.

Wordlog.com is not indexed by Google, almost a year after I started publishing this blog. You might have seen me going slow around these parts every once in a while. The reason for this is Google. I figure that since this site is not indexed, the posts I make here will not show up when someone searches for the relevant information. So why bother? All the visitors I get to the blog come from links on other blogs, and I cannot thank you all enough for that.

As for Google, I have contacted their support department many times without ever getting a favorable response. The first time I wrote, the response was that Google thinks this website is a link farm, or something of that sort, and that I have to follow the webmaster guidelines. I made the effort to check that there are no hidden links or other violations of their guidelines. There are none. This is just a normal WordPress weblog, with minimal modifications, if any.

The motive for starting this blog was two-fold: to inform WordPress users of the latest developments through informative articles and news, beyond the “here’s a new plugin/theme” information, and to make money through advertising. Looks like I am failing both ways.

So, if any of you kind readers have any influence on this subject and can help me sort out this issue with Google as soon as possible, I would greatly appreciate it. Please leave me a comment if you can help. Like I said, I have already tried emailing Google’s support team, and that did not help. Please suggest alternative, honest methods of resolution, or, if you work at Google and can do something to help me, please let me know through a comment.

If nothing works out and I am still as hopeless at the end of two weeks, I will make this content available at WebLog Tools Collection, since I have already discussed it with Mark. That way, at least you, the WordPress users, win!

Update:

I thought I might answer the few questions and reply to some of the suggestions here in the post, so future visitors can read them, too:

Validate your meta keys dude. Validating it would help (Mostly, make all letters CAPITAL and add / before the ending >).

I believe I have validated the pages here. I did so again today. Google was never too serious about XHTML validation, now was it?

REMOVE ALL FRAMES you have running on this site.

I do not use frames.

Keep spamming http://www.google.com/addurl/?continue=/addurl (Just kidding :P )

Right!

If you use ASP, you’re in bad luck. Googlebot hates ASP.

This site proudly uses WordPress (PHP).

Maybe try removing the commented RSS feed stuff from your template as they add up to a lot of links that could be described as cloaked and even though they are just links to RSS feeds on your own site it could be enough to upset Google.

I rather like providing RSS feeds for comments – I for one, use comment feeds to track conversations. I shouldn’t have to do this – there is nothing spooky about the poor feeds :)

DO NO EVIL. There are some sharks in these waters that can get away with it, but Google spends a lot of time trying to weed these PageRank spammers out. Once you get blacklisted, that is IT.

I am not evil. I do no evil either.

Put up great content

One can only try :)

Do not ‘link farm’.

I never added a link without it having some purpose. I do not link farm, unless a blogroll counts as a link farm, and mine isn’t even very long.

Forget google. I know ‘google’ is now a verb, but you also get traffic from yahoo, MSN, etc. Get listed there, wait a while, and Google will list you too.

The other search engines have had me since the third day after I started. One cannot overlook the fact that Google is by far the most popular, though.

Start fresh with a new domain. Keep the current one working, but do NOT cross-post. The spiders see that and remove the duplicated sites, and also:
one “potential” solution is to change your site’s main URL to www.wordlog.com instead of just wordlog?

I refuse to. I mean, I get to choose the domain name, so this is not an option.

Well, I guess it’s not lack of linkage …, and also:
you might want to get a lot of people to link to you.

No it is not a lack of linkage. Photomatt was the first to link to me, and on the same day, scores of others linked to this site. Articles from this site appear on the WordPress Planet and in the dashboards of thousands of WordPress-driven blogs.

Did you use to have a huge amount of links on here?

No, never did.

You wrote that you want to make money from ads. I don’t see any. Any good reason?

Anyone can advertise on this blog if they are interested. There are the blogads ad spots, in the right menu, where you can start, if you are interested in helping me make my first million.

Do you have any htaccess/ip bans in place?
Also fix up your css. http://wordlog.com/print.css returns a file not found. Normally I wouldn’t think it’d be a huge issue, but it couldn’t hurt to fix.
Was the site ever listed in google? Was it previously owned before you had it?

No, I am not trying to block anyone, in fact. The print.css was there until yesterday. I upgraded yesterday. I will restore it, and the robots.txt, shortly, not that it seems to matter. This site has never been listed on Google. I even tried submitting the site to dmoz.org (the google directory) without success.
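
Since robots.txt and accidental bans came up, here is a minimal sketch of how one could check whether a robots.txt file shuts out Googlebot. It uses Python's standard `urllib.robotparser` module. The function name `googlebot_allowed` and the two robots.txt texts are made up for illustration; they are not this site's actual files, and passing this check says nothing about whether Google has flagged a domain for other reasons.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt contents, for illustration only.
PERMISSIVE = """\
User-agent: *
Disallow:
"""

BLOCKING = """\
User-agent: Googlebot
Disallow: /
"""

if __name__ == "__main__":
    # An empty Disallow line allows everything for every crawler.
    print(googlebot_allowed(PERMISSIVE, "http://wordlog.com/"))
    # A "Disallow: /" rule under "User-agent: Googlebot" blocks only that bot.
    print(googlebot_allowed(BLOCKING, "http://wordlog.com/"))
```

To test a live file instead, paste its contents in as the first argument. An .htaccess or IP ban would not show up here; that has to be checked in the server configuration and access logs.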

65 Comments

  1. The commenting was broken. It works now!

    Comment by Carthik — 5/19/2005 @ 1:05 pm

  2. A link farm hey? Doesn’t sound good. I don’t see as many links here as other sites that ARE indexed by google. I haven’t had any trouble with them, googlebot hits my site hundreds of times throughout the day on a typical day.

    Maybe you could get another domain and point it here. Submit that domain to be crawled and see if it does. That’s the only thing I can think of. Maybe this domain is somehow flagged in Google.

    Comment by tyler — 5/19/2005 @ 1:10 pm

  3. Wordlog.com – Help Needed with Google

    Anyone able to offer any insight into why a site isn’t listed on Google.com after being in existence for a year?
    “Please think of this as a public appeal for help.
    Wordlog.com is not indexed by google, almost a year after I started publi…

    Trackback by Is there a PC Doctor in the house? — 5/19/2005 @ 1:10 pm

  4. tyler, I know from the google guys that the site is “flagged”. I really wouldn’t want to deal with getting a different domain name and all that jazz – I sort of like “wordlog.com” – besides it is ridiculous to have to do that, just because Google is acting weird.

    Comment by Carthik — 5/19/2005 @ 1:24 pm

  5. Maybe try removing the commented RSS feed stuff from your template as they add up to a lot of links that could be described as cloaked and even though they are just links to RSS feeds on your own site it could be enough to upset Google.

    Do that and then try contacting them again for reinstatement or just wait and see if your site gets listed after the next indexing takes place.

    Comment by Jason Bainbridge — 5/19/2005 @ 1:45 pm

  6. What do you mean by ‘flagged’? Can you be more explicit?

    It sounds like Google once thought your domain was bad, and now it will not be indexed. If that is true, you have to beg, plead, pay to have it indexed.

    I definitely know how to increase PageRank in Google. I founded the OpenDomain program ( http://OpenDomain.Org ) as a way to promote domains. To see, check out some of our domain keywords: “Xaml”, “bloog”, “greylisting”, “free tv” or even “wordpress”. We would even be happy to cross link with you to promote your site.

    Here are some general rules for search
    1) DO NO EVIL. There are some sharks in these waters that can get away with it, but Google spends a lot of time trying to weed these PageRank spammers out. Once you get blacklisted, that is IT.
    2) Put up great content
    3) Do not ‘link farm’. You can cross-index, but to automate it or to grab multiple domains just to increase your PR will not work for long. OpenDomain does have a lot of domains, but we do NOT abuse the system. All of our links are ‘quality’ links.
    4) Forget google. I know ‘google’ is now a verb, but you also get traffic from yahoo, MSN, etc. Get listed there, wait a while, and Google will list you too.
    5) Start fresh with a new domain. Keep the current one working, but do NOT cross-post. The spiders see that and remove the duplicated sites.

    Comment by Ric — 5/19/2005 @ 1:45 pm

  7. Well, I guess it’s not lack of linkage: if you search for ‘wordlog’, several links to you appear right on the first page (namely on Weblog Tools Collection, Photomatt, etc.).

    Comment by João Craveiro — 5/19/2005 @ 2:08 pm

  8. I spend quite a lot of time on SEO (search engine optimisation); as far as I can tell there are no “blackhat” like SEO strategies in place here at all.

    I’d make a post over at forums.seochat.com and see if anyone can help you get out of Google’s bad books. There are a few people there who have the same problem… one “potential” solution is to change your site’s main URL to http://www.wordlog.com instead of just wordlog?

    I wrote a guide on normal seo tactics etc (here) but from the looks of it the site is performing well on other engines (view it at http://www.uptimebot.com/sql/one.php)

    Comment by fx — 5/19/2005 @ 2:11 pm

  9. wordlog.com > Help Needed with Google

    wordlog.com > Help Needed with Google…

    Trackback by Sideblog — 5/19/2005 @ 2:28 pm

  10. that seems really weird, my site got indexed after a day i think, and i didn’t even have any incoming links… you might want to get a lot of people to link to you. also are you listed in google’s directory? it’s shared with a lot of search engines, really works…

    Comment by OMEITOR — 5/19/2005 @ 2:28 pm

  11. Why would it be “flagged” ?

    Comment by Ozh — 5/19/2005 @ 3:21 pm

  12. You wrote that you want to make money from ads. I don’t see any. Any good reason?

    Elad

    Comment by Elad — 5/19/2005 @ 3:39 pm

  13. Yah, the reason for the flagging should be revealed. Did you use to have a huge amount of links on here? It doesn’t look like a link farm unless you’ve got a bunch of those hidden pages…

    I think it’d be interesting if you just bought another domain and have it forward to this site. Just to see if google will index the other domain with the exact same content. GoDaddy has domains for $8.95 a year. I mean you don’t even need to give the domain out anywhere, just submit it to google.

    Comment by tyler — 5/19/2005 @ 3:54 pm

  14. I can’t figure it out either. This site has TONS of links. Doesn’t make any sense that with all the links pointing here it would still not even be indexed in Google, much less ranking for terms. I’m at a loss.

    http://web.archive.org/web/20040404083123/http://wordlog.com/

    Maybe this has something to do with it?

    Were you ever even indexed in Google and getting hits? Or was it always like this?

    Comment by Recipher — 5/19/2005 @ 3:57 pm

  15. 1) Validate your meta keys dude. Validating it would help (Mostly, make all letters CAPITAL and add / before the ending >).
    2) REMOVE ALL FRAMES you have running on this site.
    3) Keep spamming http://www.google.com/addurl/?continue=/addurl (Just kidding :P )
    4) If you use ASP, you’re in bad luck. Googlebot hates ASP.
    5) Googlebot thinks your site is a spamming site, or it gets so much traffic that it doesn’t bother to index your site.

    Comment by XeroCool — 5/19/2005 @ 4:58 pm

  16. Ozh, tyler, like I said in the post, they thought it was a link farm, and suggested that I conform to the webmaster guidelines. I unfortunately don’t have a copy of the email they sent me, so there.

    Comment by Carthik — 5/19/2005 @ 5:16 pm

  17. I’ll post updates and replies to your suggestions in the article, instead of as comments. I have replied to all the comments above this one for now.

    Comment by Carthik — 5/19/2005 @ 5:44 pm

  18. The DNS entry for google.com gives a phone number of +1.6503300100 for Administrative, Technical, and Zone contact. You could try calling them, though I doubt it will be very effective.

    Comment by David Nagle — 5/19/2005 @ 6:08 pm

  19. About the validation: Googlebot does take it seriously, since if your site doesn’t validate, the cached copy of the site gets screwed up and Googlebot might not be able to index those links and stuff.

    Comment by Xerocool — 5/19/2005 @ 6:46 pm

  20. Do you have any htaccess/ip bans in place? possibly you blocked the ip of their bot.

    While it shouldn’t matter, it wouldn’t hurt to make a simple robots.txt file and upload it to the root.

    also fix up your css. http://wordlog.com/print.css returns a file not found. Normally I wouldn’t think it’d be a huge issue, but it couldn’t hurt to fix.

    Was the site ever listed in google? Was it previously owned before you had it? Or was it used for anything shady before it was in the state that it is in now? Possibly it’s a ban that hasn’t been lifted from a while back. Just a few thoughts off of the top of my head.

    Comment by mike — 5/19/2005 @ 6:47 pm

  21. [...] e

    Thursday, May 19th, 2005 at 4:58 pm by XeroCool

    WordLog needs help with GoogleBot. It won’t crawl their si [...]

    Pingback by WordLog Help With Google » r0x0rz — 5/19/2005 @ 6:58 pm

  22. Avoid linking to yourself via direct URLs. That is, instead of [a href=“http://wordlog.com/archives/2005/”]2005 archives[/a] use [a href=“/archives/2005/”]2005 archives[/a]. I had this problem once, and a week after I removed the link to myself, I ranked 1st in a search for “jona’s blog.” (The site has since been renamed and relocated, so I don’t know if I’m still first.) Hope this helps.

    Comment by Jonathan Fenocchi — 5/19/2005 @ 7:22 pm

  23. Jonathan,

    I can’t do away with absolute links so easily, and considering as all WordPress blogs have absolute links, I don’t think this is the problem.

    Comment by Carthik — 5/19/2005 @ 7:26 pm

  24. You’ve probably run into a situation where the previous owner of the domain was spamming or doing unsavory things and thereby got into Google’s bad graces. I don’t know why Google has been non-responsive to the situation, though, since it’s bound to generate a lot of bad publicity for them.

    Comment by IO ERROR — 5/19/2005 @ 7:28 pm

  25. Will Google ban you for not using www?

    Carthik’s WordPress-related site, WordLog, has been banned from Google since its inception in 2004. And nobody can figure out why.
    Google’s support department apparently always responds back saying the site is part of a spam link farm, bu…

    Trackback by www. is not deprecated — 5/19/2005 @ 7:42 pm

  26. I’ve not had www on Photo Matt or WordPress for about two years and both are PR8, so I highly doubt that has anything to do with it.

    Comment by Matt — 5/19/2005 @ 7:57 pm

  27. [...] in: Frustrations » Pure idiocy by Kyrre Baker, 02:07:08 If even a little of what is written here is true, I would say I think Google may have gotten a bit too big [...]

    Pingback by Er Google for store? — 5/19/2005 @ 8:07 pm

  28. no that has nothing to do with it. though i would at least have all http://www.wordlog.com redirect to it without the www. google has a problem where it will index http://wordlog.com/ and http://www.wordlog.com as two separate pages, thus giving you a duplicate content penalty.

    Comment by mike — 5/19/2005 @ 8:19 pm

  29. It could be where you are hosted. If someone linkspamed from your IP/domain/IP block previously google might blacklist it. www or not is not an issue. Google have some clues and know what that’s all about.

    Have you looked at your server logs? Are you being spidered?

    Comment by Joseph Lindsay — 5/19/2005 @ 10:07 pm

  30. I’m not keen enough to give details, but perhaps a robots.txt file could help/be the problem?

    I used one to keep my pages from being cached.

    Comment by Andy — 5/19/2005 @ 10:09 pm

  31. [...] on my somewhat b0rked laptop earlier today and only had Dillo to browse with when I caught Carthik’s plea for help with Google inclusion.It is utterly baffling that s [...]

    Pingback by Team Murder » The Google-opticon Strikes Again — 5/19/2005 @ 10:19 pm

  32. I’d recommend that you forget about Google. They have become very dictatorial and are becoming very politically correct in who they allow into their Adsense program. They are quickly becoming a company with which I really don’t want to be associated.

    They labeled my site offensive (yeah, right) because my site has a Christian focus and also because I had written many Bible study notes/devotionals. When I pressed for details as to what makes a web site a non-offensive site they just ignored my emails.

    I get much more hits from other search engines such as Yahoo!, MSN, etc. than I do from Google anyway. That’s despite being ranked in the top 5 to 10 in many categories on Google.

    Comment by Larry — 5/19/2005 @ 11:53 pm

  33. You have been penalized, no doubt. Inquire with ‘G’ to have the ban lifted.

    Comment by Vlad — 5/20/2005 @ 12:18 am

  34. I think IO ERROR is on to something — what is the history of this domain? I’m having a similar problem which I suspect is the result of the previous owner of a domain having used it for spamming/squatting/evil things.

    Comment by Mike — 5/20/2005 @ 12:20 am

  35. Andy, there is a robots.txt file now.

    Should I be penalized for the actions of someone that owned the domain before me?

    and Vlad, I have contacted “g” many times over.

    Comment by Carthik — 5/20/2005 @ 1:38 am

  36. http://web.archive.org/web/20040301024913/http://wordlog.com/

    your site was like that for 2 months (at least)

    Comment by fx — 5/20/2005 @ 2:52 am

  37. Googlebot being picky about html validation is a joke. 472 errors on the main page of a site I run, and it gets indexed by the Googlebot more than needed.

    Google indexes pages and words, they don’t give a f*ck about whether your site is correctly displayed in Mac IE 4.1, uses semantic markup or passes accessibility tests.

    Comment by Ozh — 5/20/2005 @ 6:18 am

  38. you said: “I even tried submitting the site to dmoz.org (the google directory) without success.”

    you want to find an editor or meta-editor (linked at the bottom of a relevant category, or parent to a relevant category) and contact them to follow-up on your submission

    dmoz.org is a human managed listing, so you do have to be patient (and then be patient again until the public version of the listing gets updated, then wait for google to get it… can you tell I’ve done this before?)

    as far as I’m aware they’re referenced, but not owned, by the big G

    Comment by d@vid — 5/20/2005 @ 7:54 am

  39. Read this article by Danny Sullivan. Maybe you’ll get a clue: http://blog.searchenginewatch.com/blog/050331-211635

    Comment by Lars — 5/20/2005 @ 9:10 am

  40. Lars -

    That situation has already been rectified by both Matt and Google. It was a bad idea, Matt owned up to it, and problem is solved. It’s amazing how many residual sour grape wineries were spawned and apparently maintained waaaay after the fact by that very odd week. Aside from that, how would Carthik be identified by Google with Wordpress.org directly? Maybe they’ve got a guilt by associative empathy algorithm in beta that we just don’t know about?

    Comment by goneaway — 5/20/2005 @ 3:49 pm

  41. I know this sounds like spam, but I cannot recommend Selfpromotion.com enough. It’s a totally free site submission site that does away with all the gimmicks and games.

    But there are some things you can do yourself. When you can’t get in the front door, go in the back. I did a bunch of research over a year ago on what sites cull from other sites. Learned a valuable lesson. Go to DMOZ at http://dmoz.org/add.html and enter your site information. Most all of the serious sites pull from here. It works. Once I was in DMOZ, I was in just about all the search engines.

    All free, and it works. It ain’t fun but you can get around Google by getting in the back door.

    Comment by Lorelle — 5/20/2005 @ 9:07 pm

  42. Regarding DMOZ (the Open Directory Project), you are already listed. Check http://editors.dmoz.org/Computers/Internet/On_the_Web/Weblogs/Resources/

    You were listed there on 2004-09-30 at 11:52:47 CDT.

    Thus, I doubt that talking to metas or such as other people have recommended will do any good. DMOZ has no control over what Google does. Either Google hasn’t updated their ODP dump since before Sept 2004 or they specifically have dropped your site for some reason after importing the ODP dump.

    Comment by David Nagle — 5/20/2005 @ 10:01 pm

  43. Grr, that link should be:
    http://dmoz.org/Computers/Internet/On_the_Web/Weblogs/Resources/

    I posted the link to the editors server by mistake. My apologies.

    Comment by David Nagle — 5/20/2005 @ 10:03 pm

  44. [...] a couple of days visiting the site and personally do not wish to see it disappear. Take a LOOK at his plea. This entry was posted o [...]

    Pingback by Ruben’s Blog » Blog Archive » Google - Index Wordlog! — 5/21/2005 @ 1:21 am

  45. try to change the IP for your server

    Comment by mike — 5/21/2005 @ 12:39 pm

  46. My thoughts:
    http://bloggerruggles.blogspot.com/2005/05/crowning-king.html

    Comment by SteveG — 5/21/2005 @ 4:26 pm

  47. A bit in the vein of “you also get traffic from yahoo, MSN, etc. Get listed there, wait a while, and Google will list you too.”, do submit your site in the proper category at http://www.dmoz.org , the Open Directory Project. Any non-Google web search that is based on that (and lots of directories are – AOL comes to mind too) will show you then.

    Sorry I can’t help with google. Ganbatte!

    Comment by Estara — 5/21/2005 @ 6:45 pm

  48. [...] and in the dashboards of thousands of WordPress-driven blogs.

    Which don’t count — Google can’t see dashboards, can it?

    Comment by João Craveiro — 5/21/2005 @ 6:57 pm

  49. @David Russell: I guess it’s my name (the a with a tilde seems not to be escaped to entity…)

    Comment by Joao Craveiro — 5/21/2005 @ 7:02 pm

  50. Man, are you a persistent one! If ‘G’ is that important to you, reload your site with a different web address. The time wasted could have already appeased you. You seem so self-defeating.

    Comment by Margret — 5/22/2005 @ 3:12 am

  51. http://www.whois.sc/wordlog.com shows you in DMOZ. I have no idea what’s up, but a DMOZ listing will lead to extra traffic.

    Comment by KC — 5/22/2005 @ 9:23 pm

  52. Try signing up for Adsense and putting it on the site. They have to spider the page for Adsense to produce ads. They claim it won’t improve your ranking, but they still have to spider the site or it won’t work.

    Comment by Tom Hanna — 5/23/2005 @ 5:06 am

  53. Have you seen *any* requests from the Google netblock(s) or a user agent called Googlebot? If not, try and check other sites hosted on the same server. If they don’t get any visits from Googlebot, maybe there’s a firewall or routing issue somewhere upstream.

    I checked your site through Google’s translation service (IP: 216.239.*.*, prolly 216.239.38.136), so at least that host or block seems to be able to get here.

    Comment by Jan! — 5/23/2005 @ 8:26 am

  54. [...] Google Filed under: Word Press, Stuff I look At, geek — benstraw @ 3:15 pm wordlog.com » Help Needed with Google You might have seen me going slow around t [...]

    Pingback by benstraw.com » wordlog.com » Help Needed with Google — 5/24/2005 @ 3:15 pm

  55. Did you email webmaster-at-google.com with the subject ‘reinclusion request’? Tell them that it seems the previous owner used this domain for a link farm pointing to epilot.com. This can be seen here: http://web.archive.org/web/*/http://wordlog.com
    Spam at the beginning of the year (2 links), later your blog.
    epilot.com is banned by Google as well.
    In my experience, Google really does look at sites for which you submit a reinclusion request. It usually took 3 months.

    Comment by Alain — 5/24/2005 @ 6:37 pm

  56. I am just wondering if this is a server and header thing. Bots can be picky about redirects and such. I don’t know how you are set up in that regard. Just my 2 cents.

    Comment by Root — 5/24/2005 @ 7:32 pm

  57. I would try adding some small Google text ads to the side of your page. That will sort of force Google to index the content so they can serve relevant ads.

    Comment by Rian — 5/26/2005 @ 8:07 pm

  58. I’d also suggest putting adsense ads up. Not only would that force Google to spider your site, but you wouldn’t have to worry about having blank advertising space. Google will fill it for you.

    Comment by xanzi — 5/27/2005 @ 10:59 am

  59. Perhaps the lack of information in your domain name registration is affecting the credibility of your site?

    Comment by Manuel — 5/31/2005 @ 6:03 am

  60. http://www.widexl.com/remote/search-engines/metatag-analyzer.html
    This site says, that you have 182 URLs in your site. This might be a problem and a reason for google to block your site as it might be considered as spam.

    Comment by Johannes — 5/31/2005 @ 7:23 am

  61. The official way to get out of a Google ban:

    1. Go to: http://www.google.com/support and select the appropriate answers.

    2. For your message, use the words “reinclusion request” as the title and add a short explanation of the problem.

    3. Wait…….

    Is Googlebot hitting your site at all?

    Comment by encyclo — 5/31/2005 @ 3:20 pm

  62. Google just came out with this:

    Google Sitemaps

    https://www.google.com/webmasters/sitemaps/login

    Might help your cause.

    Comment by Chris — 6/3/2005 @ 10:41 am

  63. Apollo,

    That is an interesting article, unfortunately in my case, it wasn’t just a lowering in the ranking — I wasn’t indexed at all.

    In any case, thank you for the comments, everyone. I REALLY appreciate it, and my heart is swollen with joy, at knowing that this website matters to you, too :)

    I will get something done about this, somehow. Maybe change the domain, or write to them again, one last time.

    Comment by Carthik — 6/5/2005 @ 1:11 am

  64. Technorati notes over 290 inbound links for Huffington’s Toast (which gets nearly 3000 unique visitors a day and has been written up by the National Review) and G says there are no inbound links. MSN and Yahoo cite many.

    My concern is that G may be suppressing sites for political reasons.

    Comment by Huffington's Toast — 6/12/2005 @ 5:35 pm

  65. [...] June 11th was the first time I noticed WordLog in google’s search results. After soliciting your help, I had written an email to the folks at google, with the sub [...]

    Pingback by wordlog.com » Now in google — 6/12/2005 @ 6:31 pm
