How can I debug a Japanese-language Google search?

On our site, soysource.net, I have added the sample code from your website (see Add a Google Search to your Website) to the bottom of ours and customized it a bit for Japanese. If I search for something in Roman characters, like "Hiroshi", I get a number of hits. However, when I click on one of the links, it takes me to the splash page of the site and not to the actual page I was searching for. If I click on "cached" in the search results, though, I do reach the page I am looking for. Do you have any idea why this might be happening?


Dave's Answer:

An interesting problem. We've been going back and forth for a few weeks trying to figure out what's going on with the Japanese Google search so you can duplicate it on your site. The issue revolves around entering non-Roman characters in the search: that is, how do you enter Japanese kana, kanji, or other non-English words rather than romaji? It's hard for me to test because, frankly, I don't know much Japanese and I certainly have no clue how to produce Japanese words on my Mac keyboard.

But now you bring up a new wrinkle. Even with a standard romaji search, you're finding that the search box on your site behaves differently from the very same search run on Google Japan itself.

Interesting!

What I suspect is that the set of hidden parameters we're setting in the search box is somehow incompatible with the Japanese version of Google's search system. This is a pretty common issue when you're trying to reverse engineer a search box or similar, so I thought it'd make an interesting blog entry too.

Let's jump in!

First off, when I do a search from your site for "hiroshi", here's what I see:

[Screenshot: Google Japan Japanese search box 1]

I get no results for that search, but looking closely at it, I think that the check box is "search our site only". I uncheck that, do another search, and, ahhh, here we go:

[Screenshot: Google Japan Japanese search box 2]

Looks reasonable. Now, looking at the address bar, the mini-search box has produced the following URL:

http://www.google.com/search?q=hiroshi

Now if we do the same search on Google itself, we see a far more complicated URL generated:

http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f&aqi=&aql=&oq=

Now it's a process of elimination and testing. Presumably one of these values is what differentiates an on-Google search from one that comes from your site, so adding the right one back should fix the problem.
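If you'd rather not eyeball the two URLs, here's a small sketch in Python that lists the parameters the on-Google search sends and the site search doesn't. The two URLs are the ones shown above; the helper name query_params is just something I made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

# The two URLs from the address bar, copied from above.
site_url = "http://www.google.com/search?q=hiroshi"
google_url = "http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f&aqi=&aql=&oq="

def query_params(url):
    """Return the query string of url as a dict, keeping empty values."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))

site = query_params(site_url)
google = query_params(google_url)

# Parameters that only the on-Google search sends: these are the suspects.
extra = {name: value for name, value in google.items() if name not in site}
for name, value in extra.items():
    print(f"{name}={value}")
```

Each name it prints is a candidate to add back to the search form, one at a time, until the behavior matches.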

Immediately we can remove any parameter of the form name= with no value specified. That helps:

http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f
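That trimming step is easy to automate if you end up doing it for more than one URL. This is a minimal sketch, assuming Python's standard urllib.parse module; the function name drop_empty_params is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

google_url = "http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f&aqi=&aql=&oq="

def drop_empty_params(url):
    """Rebuild url without any name= parameters that have no value."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if value != ""]
    return urlunsplit(parts._replace(query=urlencode(kept)))

trimmed = drop_empty_params(google_url)
print(trimmed)
```

For the URL above, this should print the same shortened URL we arrived at by hand.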

Now from this point, if you decide you want to duplicate every single value, the way to slip them into the search form is to specify each name=value pair as a hidden input field. Like this:

<input type="hidden" name="hl" value="en" />

Put all of those between the <form> and </form> tags and, when someone clicks on the search button, the resulting URL should be identical to that of an on-Google search, and the end result should match too.

Does it? I will note after all this explanation that I don't see the problem you detail: when I click on a result from a search run directly off your site, it works just fine. So that's weird.

Anyway, keep debugging, leave your latest results in the comments, and hopefully others can be illuminated by the discussion.
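Typing those hidden fields by hand gets tedious, so here's a rough sketch that generates them from a search URL. It's only an illustration: the function name hidden_inputs is made up, and I'm assuming the q parameter is skipped because the visible text box in your form already supplies it. If your submit button is already named btnG, you'd want to skip that one too.

```python
from html import escape
from urllib.parse import urlsplit, parse_qsl

trimmed_url = "http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f"

def hidden_inputs(url, skip=("q",)):
    """Turn each name=value pair in url into a hidden <input> tag.

    Names listed in skip are left out (by default q, which the visible
    search box is assumed to provide). Empty values are dropped because
    parse_qsl ignores them unless told otherwise.
    """
    pairs = parse_qsl(urlsplit(url).query)
    return [
        f'<input type="hidden" name="{escape(name)}" value="{escape(value)}" />'
        for name, value in pairs
        if name not in skip
    ]

tags = hidden_inputs(trimmed_url)
print("\n".join(tags))
```

Paste the printed lines between the <form> and </form> tags, then remove them one at a time to find out which parameter actually matters.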



Comments

Dear dave,
It's been over a month now that I've been unable to receive emails through my Gmail or Facebook account. Please help.
Marsha
Artistloves2compassion@gmail.com

Posted by: marsha at April 6, 2010 6:17 PM

© 2002 - 2010 by Dave Taylor. All Rights Reserved.

"Ask Dave Taylor®" is a registered trademark of Intuitive Systems, LLC.