Web scraping: Difference between revisions
m (→{{header|Phix}}: added libheader)
(→{{header|Raku}}: updated raku programming solution ( with a new data source ))
Line 1,693:
=={{header|Raku}}==
(formerly Perl 6)
<lang perl6># 20210301 Updated Raku programming solution

use HTTP::Client;

#`[ Site inaccessible since 2019 ?
my $site = "http://tycho.usno.navy.mil/cgi-bin/timer.pl";
HTTP::Client.new.get($site).content.match(/'<BR>'( .+? <ws> UTC )/)[0].say
# ]

my $site = "https://www.utctime.net/";

# Capture the ISO 8601 timestamp from the UTC row of the page's table
my $matched = HTTP::Client.new.get($site).content.match(
   /'<td>UTC</td><td>'( .*Z )'</td>'/
)[0];

say $matched;

#$matched = '12321321:412312312 123';

# DateTime.new throws on a malformed string, so the check runs inside try
try {
   DateTime.new($matched.Str);
   say 'The fetch result seems to be of a valid time format.';
   CATCH { default { put .^name, ': ', .Str } }
}</lang>
Note that text between '<' and '>' in a regex refers to a regex token, so a literal '<BR>' must be quoted to match, while <ws> refers to the built-in whitespace token.
Also, whitespace is ignored by default in Raku regexes.
{{out}}
<pre>
「2021-03-01T17:02:37Z」
The fetch result seems to be of a valid time format.
</pre>
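As a minimal sketch of the regex quoting rule used in this section, the snippet below applies the older pattern to a hard-coded string instead of a live page. The sample text in <code>$html</code> is a made-up assumption for illustration and is not taken from either site.

<lang perl6># Hypothetical sample input; nothing is fetched from the network.
my $html = 'Universal Time<BR>Mar. 01, 17:02:37 UTC<BR>';

# '<BR>' is quoted, so it matches those four literal characters.
# <ws> is the built-in whitespace token.
# The spaces written inside the regex itself are not significant.
if $html ~~ / '<BR>' ( .+? <ws> UTC ) / {
    say ~$0;    # stringify the first positional capture
}</lang>

With this input the capture should hold the text between the first '<BR>' and the end of 'UTC'. Without the quotes, <BR> would be parsed as a call to a token named BR, not as literal text.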
=={{header|REBOL}}==