Issue 741029: HTMLParser -- possible bug in handle_comment



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/38533

classification

Title:	HTMLParser -- possible bug in handle_comment
Type:		Stage:
Components:	Library (Lib)	Versions:	Python 2.2

process

Created on 2003-05-21 10:35 by scott_israel, last changed 2022-04-10 16:08 by admin. This issue is now closed.

Messages (2)
msg16093	Author: Scott Israel (scott_israel)	Date: 2003-05-21 10:35
>>> import HTMLParser
>>> class Parser(HTMLParser.HTMLParser):
...     def __init__(self):
...         HTMLParser.HTMLParser.__init__(self)
...     def handle_data(self, data):
...         print 'DATA: %s' % data
...     def handle_comment(self, comment):
...         print 'COMMENT: %s' % comment
...
>>> test3 = '<STYLE><!-- This is a comment --> </STYLE>'
>>> p = Parser()
>>> p.feed(test3)
DATA: <!-- This is a comment -->

Is this a bug?
msg16094	Author: Grant Olson (logistix)	Date: 2003-05-21 20:04
No. <style> is one of the elements whose content is CDATA, so comment markup inside it is not parsed as a comment. This was done to 'enable legacy support' by allowing authors to write:

<style> <!-- body{dd:00;} --> </style>

Without the comment delimiters, most legacy browsers would display the text "body{dd:00;}" on the rendered web page. The relevant section of the HTML 4 spec is here: http://www.w3.org/TR/html4/present/styles.html#h-14.5
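
For anyone checking this on a current interpreter, here is a minimal sketch of the same behaviour. It assumes Python 3, where the module is html.parser rather than the Python 2 HTMLParser module used in the report. The Recorder class and parse helper are hypothetical names made up for this illustration. The sketch feeds the same comment once inside <STYLE> and once inside <P>, so the two cases can be compared.

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Hypothetical helper: records data and comments instead of printing them."""

    def __init__(self):
        super().__init__()
        self.data = []
        self.comments = []

    def handle_data(self, data):
        # Data may arrive in more than one chunk, so collect the pieces.
        self.data.append(data)

    def handle_comment(self, comment):
        self.comments.append(comment)


def parse(markup):
    """Parse markup and return (joined data, list of comments)."""
    parser = Recorder()
    parser.feed(markup)
    parser.close()
    return "".join(parser.data), parser.comments


# Same input as in the report: the comment sits inside a CDATA element.
in_style = parse('<STYLE><!-- This is a comment --> </STYLE>')

# Same comment inside an ordinary element, for comparison.
in_p = parse('<P><!-- This is a comment --> </P>')

print('inside <STYLE>:', in_style)
print('inside <P>:    ', in_p)
```

Inside <STYLE> the comment text should come back through handle_data and the comment list should stay empty. Inside <P> it should reach handle_comment instead. Code that needs the stylesheet text without the legacy comment wrapper has to strip the <!-- and --> delimiters from the data itself.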