This function reads an XML file as plain text, removes duplicate attributes within each XML tag, and parses the cleaned text with `xml2::read_xml`. Within each tag, only the first occurrence of an attribute name is kept; any later duplicates are dropped. The result is returned as an `xml_document` object.

readXML_remove_duplicate_attributes(xml_file)

Arguments

xml_file

A character string representing the path to the XML file to be read and cleaned.

Value

An `xml_document` object representing the cleaned XML with duplicate attributes removed.

Details

The function processes the XML file in three main steps:

1. Reads the XML file as plain text and combines it into a single string.
2. For each XML tag, identifies and removes any duplicate attributes, retaining only the first occurrence of each attribute.
3. Parses the cleaned XML string with `xml2::read_xml` and returns it as an `xml_document` object.
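The three steps can be sketched roughly as follows. This is an illustrative approximation, not the function's actual source: the helper names `dedupe_tag_attributes`, `dedupe_xml_text`, and `read_xml_dedupe_sketch` are hypothetical, and the regular expressions assume ASCII attribute names with quoted values. Tag-like text inside comments or CDATA sections would also be rewritten by this sketch.

```r
# Hypothetical helper: within one tag string, keep only the first
# occurrence of each attribute name.
dedupe_tag_attributes <- function(tag) {
  # An attribute: leading whitespace, a name, "=", and a quoted value.
  attr_pattern <- "\\s+[A-Za-z_:][-A-Za-z0-9_:.]*\\s*=\\s*(\"[^\"]*\"|'[^']*')"
  m <- gregexpr(attr_pattern, tag, perl = TRUE)
  attrs <- regmatches(tag, m)[[1]]
  if (length(attrs) < 2) return(tag)

  # The attribute name is the first run of characters that are
  # neither whitespace nor "=".
  attr_names <- regmatches(attrs, regexpr("[^\\s=]+", attrs, perl = TRUE))
  keep <- !duplicated(attr_names)

  # Replace later duplicates with the empty string.
  regmatches(tag, m) <- list(ifelse(keep, attrs, ""))
  tag
}

# Hypothetical helper: apply the per-tag cleanup to every opening or
# self-closing tag in a string of XML text.
dedupe_xml_text <- function(xml_text) {
  # "<name ... >", allowing ">" to appear inside quoted attribute values.
  tag_pattern <- "<[A-Za-z_:][^<>\"']*(?:(?:\"[^\"]*\"|'[^']*')[^<>\"']*)*>"
  m <- gregexpr(tag_pattern, xml_text, perl = TRUE)
  tags <- regmatches(xml_text, m)[[1]]
  cleaned <- vapply(tags, dedupe_tag_attributes, character(1), USE.NAMES = FALSE)
  regmatches(xml_text, m) <- list(cleaned)
  xml_text
}

# Hypothetical wrapper mirroring the three documented steps.
read_xml_dedupe_sketch <- function(xml_file) {
  # Step 1: read the file as plain text and collapse it to one string.
  xml_text <- paste(readLines(xml_file, warn = FALSE), collapse = "\n")
  # Step 2: remove duplicate attributes tag by tag.
  cleaned <- dedupe_xml_text(xml_text)
  # Step 3: parse the cleaned string.
  xml2::read_xml(cleaned)
}
```

Separating the text cleanup from the parsing step means the cleanup can be checked on plain strings without touching the parser.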

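The function is intended for files that violate the XML specification by repeating an attribute name within a single tag. The specification requires attribute names to be unique within a tag, so such files are not well-formed and a conforming parser may reject them.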
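To illustrate the kind of input involved, the snippet below tries to parse a small string with a repeated `id` attribute directly with `xml2::read_xml`. The sample XML and the variable names are made up for illustration. With the parser's default options the call is expected to fail, though the exact error message depends on the installed libxml2 version.

```r
# Made-up sample: the "id" attribute appears twice in the same tag.
bad_xml <- '<root><item id="1" id="2"/></root>'

# NA means xml2 is not installed, so nothing was attempted.
parse_failed <- NA
if (requireNamespace("xml2", quietly = TRUE)) {
  parse_failed <- tryCatch({
    xml2::read_xml(bad_xml)
    FALSE
  }, error = function(e) {
    message("Parse error: ", conditionMessage(e))
    TRUE
  })
}
```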

Examples

# Example usage:
# xml_doc <- readXML_remove_duplicate_attributes("path_to_file.xml")