How can I debug a Japanese-language Google search?

On our site, soysource.net, I have added the sample code from your website (see Add a Google Search to your Website) to the bottom of ours, and I customized it into Japanese a bit. If I search for something in Roman characters like "Hiroshi", I get a number of hits. However, when I click on one of the links, it takes me to the splash page of the site and not to the actual page I was searching for. If I click on "cached" in the search results, though, I do reach the page I am looking for. Do you have any idea why this might be happening?

An interesting problem. We've been going back and forth for a few weeks trying to figure out what's going on with the Japanese Google search so you can duplicate it on your site. The issue revolves around entering non-Roman characters in the search: that is, how do you enter Japanese kana and kanji, rather than romaji, or other non-English words? It's hard for me to test because, frankly, I don't really know much Japanese and I certainly have no clue how to produce Japanese words on my Mac keyboard.

But now you bring up a new wrinkle. Even with a standard romaji search, you're finding that the behavior Google Japan exhibits is different from what you'd see if you did the very same search on Google's site itself. Interesting! What I suspect is that the set of hidden parameters we're setting in the search box is somehow incompatible with the Japanese version of Google's search system.

The thing is, this is a pretty common issue when you're trying to reverse engineer a search box or similar, so I thought it'd make an interesting blog entry too. Let's jump in!

First off, when I do a search from your site for "hiroshi", I get no results, but looking closely, I think the check box is "search our site only". I uncheck that, do another search, and, ahhh, here we go: the results look reasonable.
Now, looking at the address bar, the mini-search box has produced the following URL: http://www.google.com/search?q=hiroshi
Now if we do the same search on Google itself, we see a far more complicated URL generated: http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f&aqi=&aql=&oq=
Now it's a process of elimination and testing. Presumably one of these values is what differentiates an on-Google search from one that comes from your site, so adding the right one to your form should fix the problem. Immediately we can remove anything of the form name= without a value specified. That helps: http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f
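If you'd rather not compare the two URLs by eye, the same elimination can be scripted. What follows is only a rough sketch in Python using the standard library: the function name non_empty_params is made up for this example, and the two URLs are the ones shown above. It drops the name= pairs that have no value, then lists what the on-Google search sends that the search box on the site does not.

```python
from urllib.parse import urlsplit, parse_qsl

def non_empty_params(url):
    """Return the (name, value) query pairs of url that actually carry a value."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps pairs like "aqi=" so we can drop them explicitly
    pairs = parse_qsl(query, keep_blank_values=True)
    return [(name, value) for name, value in pairs if value != ""]

# The two URLs from the article: one from Google itself, one from the site's search box
google_url = "http://www.google.com/search?hl=en&q=hiroshi&btnG=Search&aq=f&aqi=&aql=&oq="
site_url = "http://www.google.com/search?q=hiroshi"

on_google = non_empty_params(google_url)
on_site = dict(non_empty_params(site_url))

# Candidates to test: parameters Google sends that the site's form does not
missing = [(name, value) for name, value in on_google if name not in on_site]

print(on_google)  # [('hl', 'en'), ('q', 'hiroshi'), ('btnG', 'Search'), ('aq', 'f')]
print(missing)    # [('hl', 'en'), ('btnG', 'Search'), ('aq', 'f')]
```

The missing list is just the set of candidates. Which of them (if any) actually changes the behavior still has to be found by adding them one at a time and testing.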
Now from this point, if you decide you want to duplicate every single value, the way to slip them into the search form is to specify each name=value pair as a hidden field. Like this: <input type="hidden" name="hl" value="en" />
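Typing out one hidden field per parameter gets tedious if you're testing several combinations, so here's a hedged sketch of generating them. The helper name hidden_inputs is invented for this example, and the parameter list is simply the leftover values from the trimmed URL above, minus q, which the visible search box already supplies.

```python
from html import escape

def hidden_inputs(params):
    """Build one hidden <input> tag per (name, value) pair, HTML-escaping both."""
    lines = []
    for name, value in params:
        lines.append(
            '<input type="hidden" name="{}" value="{}" />'.format(
                escape(name, quote=True), escape(value, quote=True)
            )
        )
    return "\n".join(lines)

# Assumption: these are the extra values from the on-Google URL we want to mirror
params = [("hl", "en"), ("btnG", "Search"), ("aq", "f")]
print(hidden_inputs(params))
```

Note that on Google's own page btnG is most likely the name of the submit button rather than a hidden field. Sending it as a hidden field produces the same name=value pair in the URL, which is all that matters for this test.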
Put all of those between the <form> and </form> tags and when someone clicks on the search button, the resultant URL should be 100% identical to an on-Google search, and the end result should match too. Does it? I will note after all this explanation that I don't see the problem you detail: when I click on a result from a search made directly from your site, it works just fine. So that's weird. Anyway, keep debugging, leave your latest results in the comments, and hopefully others can be illuminated by the discussion.
Categorized:
CGI Scripts and Web Site Programming
(Article 9343)
Tagged: google, google japan, html programming, reverse engineering, search engines