gh-128136: Possibility to customize hardcoded xml declaration #128095

Closed
wants to merge 12 commits into from
48 changes: 48 additions & 0 deletions Lib/test/test_xml_etree.py
@@ -4021,6 +4021,54 @@ def test_write_to_user_binary_writer_with_bom(self):
'''<?xml version='1.0' encoding='utf-16'?>\n'''
'''<site />'''.encode("utf-16"))

    def test_custom_declaration_to_user_binary_writer_with_bom(self):
        tree = ET.ElementTree(ET.XML('''<site />'''))
        raw = io.BytesIO()
        writer = self.dummy()
        writer.write = raw.write
        writer.seekable = lambda: True
        writer.tell = raw.tell
        tree.write(writer, encoding="utf-16",
                   xml_declaration_definition='<?xml version="{version}" encoding="{encoding}"?>')
        self.assertEqual(raw.getvalue(),
                '''<?xml version="1.0" encoding="utf-16"?>\n'''
                '''<site />'''.encode("utf-16"))

    def test_custom_declaration2_to_user_binary_writer_with_bom(self):
        tree = ET.ElementTree(ET.XML('''<site />'''))
        raw = io.BytesIO()
        writer = self.dummy()
        writer.write = raw.write
        writer.seekable = lambda: True
        writer.tell = raw.tell
        tree.write(writer, encoding="utf-16",
                   xml_declaration_definition='<?xml version="1.1" encoding="{encoding}"?>')
        self.assertEqual(raw.getvalue(),
                '''<?xml version="1.1" encoding="utf-16"?>\n'''
                '''<site />'''.encode("utf-16"))

    def test_custom_declaration3_to_user_binary_writer_with_bom(self):
        tree = ET.ElementTree(ET.XML('''<site />'''))
        raw = io.BytesIO()
        writer = self.dummy()
        writer.write = raw.write
        writer.seekable = lambda: True
        writer.tell = raw.tell
        tree.write(writer, encoding="utf-16",
                   xml_declaration_definition='<?xml version="1.1" encoding="UTF-8"?>')
        self.assertEqual(raw.getvalue(),
                '''<?xml version="1.1" encoding="UTF-8"?>\n'''
                '''<site />'''.encode("utf-16"))

    def test_custom_declaration4_to_user_binary_writer_with_bom(self):
        tree = ET.ElementTree(ET.XML('''<site />'''))
        raw = io.BytesIO()
        writer = self.dummy()
        writer.write = raw.write
        writer.seekable = lambda: True
        writer.tell = raw.tell
        tree.write(writer, encoding="utf-16",
                   xml_declaration_definition='<?xml version="1.0"?>')
        self.assertEqual(raw.getvalue(),
                '''<?xml version="1.0"?>\n'''
                '''<site />'''.encode("utf-16"))

def test_tostringlist_invariant(self):
root = ET.fromstring('<tag>foo</tag>')
self.assertEqual(
15 changes: 13 additions & 2 deletions Lib/xml/etree/ElementTree.py
@@ -679,6 +679,7 @@ def iterfind(self, path, namespaces=None):
def write(self, file_or_filename,
Contributor

This should add a unit test for the new behavior, to verify that the requirements are met.

Author

I added some unit tests, but there isn't an issue yet, so I can't reference one in the title.

I also think the change has little impact on Python users.

encoding=None,
xml_declaration=None,
xml_declaration_definition="<?xml version='{version}' encoding='{encoding}'?>",
picnixz marked this conversation as resolved.
default_namespace=None,
method=None, *,
short_empty_elements=True):
@@ -694,6 +695,14 @@ def write(self, file_or_filename,
is added if encoding IS NOT either of:
US-ASCII, UTF-8, or Unicode

*xml_declaration_definition* -- format string for customizing the XML
declaration. The XML specification allows both
double and single quotes in the declaration; the
documentation shows double quotes, but single
quotes were hardcoded here. The default value is
unchanged so that existing tests still pass.
Placeholders: {version}, {encoding}

*default_namespace* -- sets the default XML namespace (for "xmlns")

*method* -- either "xml" (default), "html, "text", or "c14n"
@@ -719,8 +728,10 @@ def write(self, file_or_filename,
(xml_declaration is None and
encoding.lower() != "unicode" and
declared_encoding.lower() not in ("utf-8", "us-ascii"))):
write("<?xml version='1.0' encoding='%s'?>\n" % (
declared_encoding,))
# a separate version setting isn't necessary; the version can be
# overridden in xml_declaration_definition at runtime
data = {'version':'1.0', 'encoding': declared_encoding}
write(xml_declaration_definition.format(**data)+"\n")
if method == "text":
_serialize_text(write, self._root)
else:
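
The substitution in this hunk relies on `str.format`. The following is a minimal sketch of that behavior in isolation, assuming a hypothetical helper named `render_declaration` (it is not part of the patch or of ElementTree). It illustrates that a template may use both placeholders, one, or none, and that any other placeholder name raises `KeyError`.

```python
def render_declaration(template, encoding, version="1.0"):
    """Fill a declaration template the way the patched write() appears to.

    Hypothetical helper for illustration only.
    """
    data = {"version": version, "encoding": encoding}
    # str.format ignores unused keyword arguments, so a template may use
    # both placeholders, only one of them, or none at all.
    return template.format(**data) + "\n"


both = render_declaration(
    '<?xml version="{version}" encoding="{encoding}"?>', "utf-16")
fixed = render_declaration('<?xml version="1.1"?>', "utf-16")

# A placeholder other than {version} or {encoding} is never supplied,
# so str.format raises KeyError for it.
try:
    render_declaration(
        '<?xml version="{version}" standalone="{standalone}"?>', "utf-8")
    unknown_placeholder_rejected = False
except KeyError:
    unknown_placeholder_rejected = True
```

One consequence of this design is that a caller-supplied template containing literal braces or an unexpected placeholder fails at write time rather than being validated up front.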
@@ -0,0 +1,8 @@
Add a new *xml_declaration_definition* parameter to ``ElementTree.write()`` that allows customizing the XML declaration, which was previously hard-coded.
Contributor

As I mentioned, the NEWS entry must be shorter (one sentence in general).
Examples should be put in the documentation instead. In addition, a What's New entry must be created for new features.

The default behavior is retained: if the *xml_declaration_definition* parameter is not specified, the previous declaration, ``<?xml version='1.0' encoding='{encoding}'?>``, is used.
When the parameter is specified, the declaration can be customized, for example:
tree.write(file, encoding="utf-8", xml_declaration=True, xml_declaration_definition='''<?xml version="{version}" encoding="{encoding}"?>''')
tree.write(file, encoding="utf-8", xml_declaration=True, xml_declaration_definition='''<?xml version="1.1" encoding="{encoding}"?>''')
tree.write(file, encoding="utf-8", xml_declaration=True, xml_declaration_definition='''<?xml version="1.1"?>''')
The placeholders ``{version}`` and ``{encoding}`` are replaced: ``{version}`` is always ``1.0``, and ``{encoding}`` is determined by the code as before.
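
For comparison, a similar result seems achievable without the new parameter. This is a minimal sketch, assuming a hypothetical helper `write_with_declaration` (not part of this PR or of ElementTree), that writes the declaration by hand and suppresses the built-in one with `xml_declaration=False`. It is shown for UTF-8 only; encodings that use a byte order mark would need extra care.

```python
import io
import xml.etree.ElementTree as ET


def write_with_declaration(
        tree, stream, encoding="utf-8",
        template='<?xml version="{version}" encoding="{encoding}"?>'):
    """Write *tree* to a binary *stream* with a custom XML declaration.

    Hypothetical helper for illustration only.
    """
    declaration = template.format(version="1.0", encoding=encoding)
    # Emit the custom declaration ourselves ...
    stream.write((declaration + "\n").encode(encoding))
    # ... and ask ElementTree not to emit its own.
    tree.write(stream, encoding=encoding, xml_declaration=False)


tree = ET.ElementTree(ET.XML("<site />"))
buffer = io.BytesIO()
write_with_declaration(tree, buffer)
result = buffer.getvalue()
```

Here `result` holds the double-quoted declaration followed by the serialized `<site />` element.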