Talk:Web scraping: Difference between revisions

== Criticism ==


The task as described, and the examples so far, are extremely weak compared with what one normally expects "web scraping" to mean. The examples just pull a page and extract a line of text using simple regular expressions.


When developers talk about "web scraping" they usually mean much more than fetching a page and trivially extracting text with a simple regular expression. The task usually implies more sophisticated parsing of the page's HTML, and it frequently involves encoding the request into a query string (RESTful sites) or an HTTP POST-able form.
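
To make the distinction concrete, here is a minimal sketch in Python (standard library only) of the two things mentioned above: parsing the HTML structure rather than matching it with a regular expression, and encoding a request as a query string or as a POST form body. The names <code>LinkCollector</code>, <code>build_get</code>, <code>build_post</code> and <code>SAMPLE_HTML</code>, along with the example.com URLs and field names, are hypothetical and chosen only for illustration; nothing is sent over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is still delivered here.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def build_get(base_url, params):
    """Build (but do not send) a GET request with an encoded query string."""
    return Request(base_url + "?" + urlencode(params))


def build_post(url, fields):
    """Build (but do not send) a POST request with a form-encoded body."""
    return Request(url, data=urlencode(fields).encode("ascii"), method="POST")


# Hypothetical page fragment standing in for a fetched document.
SAMPLE_HTML = """
<ul>
  <li><a href="/tasks/1">First <b>task</b></a></li>
  <li><a href="/tasks/2">Second task</a></li>
</ul>
"""
```

Feeding <code>SAMPLE_HTML</code> to a <code>LinkCollector</code> fills its <code>links</code> list with one pair per anchor, including text that sits inside nested markup, which a single-line regular expression tends to miss. A request built this way could then be passed to <code>urllib.request.urlopen</code> to fetch the page.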