Web scraping: Difference between revisions
(→Naive: Likewise.) |
m (→Robust: Fix unnecessary use of quasiliteral to ordinary string literal.) |
||
Line 1,668: | Line 1,668: | ||
If the web page changes too much, the query will fail to match. TXR will print the word "false" and terminate with a failed exit status. This is preferable to finding a false positive match and printing a wrong result (e.g. random garbage from a line of HTML that happens to contain the string UTC). |
If the web page changes too much, the query will fail to match. TXR will print the word "false" and terminate with a failed exit status. This is preferable to finding a false positive match and printing a wrong result (e.g. random garbage from a line of HTML that happens to contain the string UTC). |
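The fail-closed behaviour described above can be illustrated outside TXR. The following is a minimal Python sketch, not the TXR solution: it accepts a line only if the whole line has the expected shape, and otherwise reports "false" with a non-zero status instead of guessing. The line layout (`<BR>Mon. DD, HH:MM:SS UTC` followed by `Universal Time`) and the names `UTC_LINE`, `extract_utc` and `main` are assumptions made for illustration.

```python
import re

# Assumed shape of the target line, e.g.
#   "<BR>Mar. 29, 21:11:13 UTC\t\tUniversal Time"
# The whole line must match, so a stray "UTC" elsewhere is never accepted.
UTC_LINE = re.compile(
    r"<BR>([A-Z][a-z]{2}\. \d{2}, \d{2}:\d{2}:\d{2}) UTC\s+Universal Time"
)


def extract_utc(html):
    """Return the timestamp text, or None if no line has the exact expected shape."""
    for line in html.splitlines():
        match = UTC_LINE.fullmatch(line.strip())
        if match:
            return match.group(1)
    return None


def main(html):
    """Print the timestamp and return 0, or print "false" and return 1."""
    stamp = extract_utc(html)
    if stamp is None:
        print("false")
        return 1
    print(stamp)
    return 0
```

Passing the downloaded page text to `main` and using its return value as the process exit status would mirror the behaviour described for the TXR query: a changed page layout produces a visible failure rather than a plausible-looking wrong answer.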
||
<lang txr>@(next @(open-command |
<lang txr>@(next @(open-command "wget -c http://tycho.usno.navy.mil/cgi-bin/timer.pl -O - 2> /dev/null")) |
||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final"//EN> |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final"//EN> |
||
<html> |
<html> |