Talk:Yahoo! search interface: Difference between revisions

:: I could turn my [http://paddy3118.blogspot.com/2009/02/extended-vanity-search-on-rosetta-code.html Vanity Search] blog entry into an RC task, but don't we already have a task that extracts info from RC stats pages? Maybe we should just let this other "search a web site that needs user input and gives answers on multiple pages" type task just die? --[[User:Paddy3118|Paddy3118]] 12:56, 3 May 2009 (UTC)
::: Actually, I'd prefer to avoid too many more tasks that pull from MediaWiki-supplied content; they're unkind to the DB backend, and my slice doesn't have enough RAM for memcached or squid to be particularly helpful when dealing with a high request-per-minute rate. Apache processes fill 256MB pretty quickly when they pile up waiting for a response from MySQL, which gets slower as it gets starved for RAM. (I can't afford to spend more on RC server performance unless RC can pay for itself.) If scraping and list navigation is the ultimate goal, something much more lightweight can be provided that doesn't even touch MySQL. --[[User:Short Circuit|Short Circuit]] 15:40, 3 May 2009 (UTC)
 
 
Google's <cite>Use of services by you</cite> seems clear about ''usage''; of course they can't prohibit writing such scripts. Nonetheless, if we post code here, it will likely be tested (I like to know whether code works, not just by reading it but by executing it), and that violates the paragraph. It sounds rather odious (I've just realized I've violated it many times with ''ad hoc'' LWP Perl scripts, mechanize libraries, wget and curl... after all, the HTTP headers are all they can use to "know" whether it's a "real browser" or not), but in the end the task may not be worth the risk for RC.
 
I haven't read the task requirements. But I believe there are search engines that have no such paragraph and allow use outside their own interface; and after all, if the point is just to show how to use a search engine and analyze its results, we need not be particularly interested in using Google's indexes... anyone who is can use the knowledge gained here at their own risk. --[[User:ShinTakezou|ShinTakezou]] 15:47, 3 May 2009 (UTC)
 
A Google API token isn't easy to get. But some languages (like .NET) send a real browser User-Agent, such as "Mozilla/5.0 (Firefox 3.0; X11)"; in other languages, like Python, you need to "hack" it because the default user-agent is something like "Python-urllib/2.x". Searching Google can sometimes be very useful for some people. I think putting a disclaimer at the top of the page is a nice solution. --[[User:Guga360|Guga360]]
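
For reference, the user-agent point above can be illustrated with a minimal sketch that only builds a request and sends nothing, so it touches no search engine. It assumes Python 3's standard <code>urllib.request</code>; the URL and the User-Agent string are hypothetical placeholders, not values taken from this discussion, and whether a given site permits scripted access remains a matter of its own terms.

```python
from urllib.request import Request

# Hypothetical values for illustration only; neither comes from the discussion.
SEARCH_URL = "http://search.example.com/?q=rosetta+code"
USER_AGENT = "RosettaCodeExample/0.1"


def build_request(url, user_agent):
    """Return a Request carrying an explicit User-Agent header.

    Nothing is sent over the network; the object is only constructed.
    """
    return Request(url, headers={"User-Agent": user_agent})


req = build_request(SEARCH_URL, USER_AGENT)

# urllib stores header names capitalized, so the key is "User-agent".
print(req.get_header("User-agent"))
```

Without the explicit header, the library would fall back to its own default identifier when the request is opened, which is what makes such scripts easy to tell apart from a browser.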