User talk:Bukzor: Difference between revisions

::Cool. If you know some more, mark 'em and I'll take a crack at it. --[[User:Bukzor|Bukzor]] 21:06, 19 April 2010 (UTC)
::: Use the MediaWiki API to grab a list of the pages in [[:Category:Python]]. Those will have Python code. Then grab the HTML for those pages, put them into a DOM. The way I have the syntax highlighting set up, any use of the lang tag will apply the language keyword as a CSS class. You should be able to select for "python" as a CSS class, and get lumps of Python code. I bet you could automate feeding that code through pylint and having it save you a report of pages->scores, so you instantly know where the most problematic code is. --[[User:Short Circuit|Michael Mol]] 01:06, 20 April 2010 (UTC)
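The first step of that suggestion (listing the pages in a category through the MediaWiki API) could be sketched roughly as below. This is only an illustration under assumptions: <code>API_URL</code> is a placeholder endpoint, the helper names are hypothetical, and the <code>continue</code> handling follows the current API response layout, which may differ on older MediaWiki installs. The network fetch is left out so the sketch stays self-contained.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint (assumption); point this at the wiki's real api.php.
API_URL = "https://example.org/w/api.php"


def category_members_url(category, cmcontinue=None, limit=500):
    """Build a list=categorymembers query URL for one batch of pages."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmlimit": str(limit),
        "format": "json",
    }
    if cmcontinue:
        # Continuation token returned by the previous batch, if any.
        params["cmcontinue"] = cmcontinue
    return API_URL + "?" + urlencode(params)


def parse_members(response_text):
    """Return (page titles, continuation token or None) from a JSON response."""
    data = json.loads(response_text)
    titles = [member["title"] for member in data["query"]["categorymembers"]]
    token = data.get("continue", {}).get("cmcontinue")
    return titles, token
```

Fetching each URL (for example with <code>urllib.request.urlopen</code>) and looping until the token comes back as <code>None</code> would yield the full page list.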
::::I really like that idea. The MediaWiki API is pretty straightforward, and I feel done with that part, but I'm having trouble getting any of the built-in HTML or XML parsers to give me a DOM. [http://docs.python.org/library/htmlparser.html htmlparser] is just a bare-bones state machine, and the XML parsers are too strict (<nowiki>&nbsp;</nowiki> is an 'unknown entity').
::::I've posted a Stack Overflow question on this subject [http://stackoverflow.com/questions/2676872/how-to-parse-malformed-html-in-python-using-standard-libraries here].
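One possible way around the missing DOM, offered as a sketch rather than as the answer to the linked question, is to skip the DOM and let the standard-library state machine collect the text of every element carrying a given CSS class. The example uses Python 3's <code>html.parser</code> (the successor of the module linked above); the class name <code>ClassTextExtractor</code> and the list of void tags are assumptions made for illustration. It tolerates entities such as <nowiki>&nbsp;</nowiki> and stray closing tags, but unclosed non-void tags inside a code block would still throw off the depth count.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element whose class list contains css_class."""

    def __init__(self, css_class):
        # convert_charrefs=True turns entities like &nbsp; into characters.
        super().__init__(convert_charrefs=True)
        self.css_class = css_class
        self.depth = 0      # nesting depth inside a matching element, 0 = outside
        self.chunks = []    # text pieces of the block being collected
        self.blocks = []    # finished blocks, one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if tag == "br" and self.depth:
                self.chunks.append("\n")  # keep line breaks inside code
            return
        if self.depth:
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.blocks.append("".join(self.chunks))
            self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)
```

Feeding a page's HTML to <code>ClassTextExtractor("python")</code> and calling <code>close()</code> leaves the extracted snippets in <code>blocks</code>, ready to be written to files and passed to a linter.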
 
:I wish someone would check my VBScript. Am I the only person on this thing who still uses it? -- [[User:Axtens|Axtens]] 04:01, 20 April 2010 (UTC)