Rosetta Code:Village Pump/RC extraction Tool and Task

:: I'm not sure how well the Dual Licensing will hold up. Every page clearly states that the contents are GFDL and that non-public domain works are not to be contributed. If someone did contribute such a work and it were found, the contributor would have violated the license and the RC terms of use, and presumably we would just remove the offending bits. --[[User:Dgamey|Dgamey]] 13:13, 27 May 2011 (UTC)
::: I agree, and I don't think it's really a worthwhile thing to consider unless someone did not read the directions. The key point I wanted to bring up is that RC content is GFDL, and that includes some frustrating restrictions. --[[User:Short Circuit|Michael Mol]] 17:08, 27 May 2011 (UTC)
 
==== Did it Already ====
 
I've been [http://nongnu.org/txr/rosetta-solutions.html doing this] for the TXR language for many years now. If you read the second paragraph in the navigation pane, you will notice that it links to the pair of scripts for generating the page.
 
One script scrapes the examples over HTTP; it works by navigating the edit links and getting the actual markup source code of the examples. The generation script parses the markup and implements a lot of the formatting. It extracts the TXR code and applies syntax coloring to it with the help of the Vim editor. The [http://nongnu.org/txr/highlight.exp <code>highlight.exp</code>] Expect script for invoking Vim noninteractively isn't linked from the page, but here it is.
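To illustrate the extraction step, here is a minimal sketch in Python (not the actual TXR scripts linked above). It assumes the solution markup wraps code in <code>&lt;lang txr&gt;</code> ... <code>&lt;/lang&gt;</code> tags; the function name <code>extract_code_blocks</code> and the sample markup are made up for the example.

```python
import re


def extract_code_blocks(section_markup: str, lang: str) -> list[str]:
    """Return the bodies of all <lang LANG>...</lang> blocks in the markup.

    Assumption: code is wrapped in <lang ...> tags; a wiki that uses a
    different tag would need a different pattern.
    """
    pattern = re.compile(
        r"<lang\s+" + re.escape(lang) + r"\s*>(.*?)</lang>",
        re.DOTALL | re.IGNORECASE,
    )
    # Strip only the newlines that directly follow/precede the tags.
    return [body.strip("\n") for body in pattern.findall(section_markup)]


# Hypothetical markup of one solution section, as seen in the edit view.
SAMPLE = """=={{header|TXR}}==
Some commentary.
<lang txr>@(do (put-line "hi"))</lang>
<lang txr>
@(bind a "b")
</lang>
"""

if __name__ == "__main__":
    for block in extract_code_blocks(SAMPLE, "txr"):
        print(block)
        print("----")
```

The commentary between the code blocks is ignored here; a real generator would also have to turn that into formatted output.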
 
From time to time (once in a blue moon) I run the fetching script. Then I do a <code>diff</code> between the newly downloaded file and the previously downloaded one. If there are any changes, I replace the stable copy and run the second script to regenerate the page.
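The same fetch, diff, replace cycle can be sketched in Python, as an illustration only. The names <code>diff_snapshots</code> and <code>promote</code> are hypothetical, and the diff is meant to be reviewed by a person before the stable copy is replaced.

```python
import difflib
import os
from pathlib import Path


def diff_snapshots(old_text: str, new_text: str) -> list[str]:
    """Return unified-diff lines between two snapshots; empty if identical."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="stable",
            tofile="fresh",
            lineterm="",
        )
    )


def promote(fresh: Path, stable: Path) -> None:
    """Replace the stable copy with the fresh download (after review)."""
    os.replace(fresh, stable)


def review(fresh: Path, stable: Path) -> bool:
    """Print the diff for inspection; return True if anything changed."""
    old = stable.read_text(encoding="utf-8") if stable.exists() else ""
    changes = diff_snapshots(old, fresh.read_text(encoding="utf-8"))
    print("\n".join(changes))
    return bool(changes)
```

If <code>review</code> reports changes and they look sane, calling <code>promote</code> and then rerunning the page generator completes the cycle.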
 
There is another reason why I follow this procedure: there are some race conditions. The edit links that are chased from the task page to the edit view of a particular solution are numerically indexed, so if someone inserts a solution, the numbering changes. I've also seen strange bugs where the script repeatedly fetches the wrong task for the wrong language, even when re-run. The URL it uses somehow returns stale data, yet the issue doesn't reproduce with a browser: fetching exactly the same URL there returns the correct data. This happens rarely, and looks like some sort of caching problem. In any case, with the "diff against sane previous copy" approach, I catch these things.
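An additional guard against the shifting section numbers, which is not something my scripts do, would be to check that the fetched markup really starts with the expected language heading. A hedged Python sketch, assuming solution sections begin with a <code><nowiki>=={{header|Language}}==</nowiki></code> line; the function names are invented for the example.

```python
import re
from typing import Optional

# Matches e.g. "=={{header|TXR}}==" and captures the language name.
# An optional second template argument (display name) is tolerated.
HEADER_RE = re.compile(
    r"^==\s*\{\{header\|([^}|]+)(?:\|[^}]*)?\}\}",
    re.MULTILINE,
)


def section_language(section_markup: str) -> Optional[str]:
    """Return the language named in the first header line, or None."""
    match = HEADER_RE.search(section_markup)
    return match.group(1).strip() if match else None


def is_expected_section(section_markup: str, expected_lang: str) -> bool:
    """True only if the fetched section is headed by the expected language."""
    found = section_language(section_markup)
    return found is not None and found.casefold() == expected_lang.casefold()
```

A fetch that fails this check can be retried or skipped. It would not catch the stale-data case where the right language comes back for the wrong task, so the diff against the previous copy is still needed.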
 
[[User:Kazinator|Kazinator]] ([[User talk:Kazinator|talk]]) 00:13, 30 October 2018 (UTC)