python - ElementTree namespace inconvenience
I can't control the quality of the XML I get. In some cases it is:
<collada xmlns="http://www.collada.org/2005/11/colladaschema" version="1.4.1"> ... </collada>
In others I get:
<collada>...</collada>
And I guess I should also handle:
<collada:collada xmlns:collada="http://www.collada.org/2005/11/colladaschema"> ... </collada:collada>
It's the same schema throughout, and I need a single parser to process it. How can I handle all of these cases? I need XPath and the other lxml goodies to get through this. How do I make the documents consistent at etree.parse time? I don't want to check for namespaces every time I need to use XPath.
My usual recommendation is to preprocess first and normalize the namespaces. This has two benefits: the normalization code is highly reusable, because it doesn't depend on how the data is processed subsequently; and the logic that processes the data is considerably simplified.
If the documents use only this one namespace, or none, and do not use qualified names in the content of text or attribute nodes, then the transformation that achieves the normalization is simple:
<xsl:template match="*">
  <xsl:element name="{local-name()}" namespace="http://www.collada.org/2005/11/colladaschema">
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
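For comparison, here is a rough Python counterpart to the stylesheet. It is only a sketch under the same assumptions (one namespace or none, no qualified names inside text or attribute values), and it swaps the XSLT for a plain loop that rewrites element tags in place. It uses the standard-library ElementTree so it runs without extra packages; the same loop should also work on a tree parsed with lxml.etree. The names `normalize_namespace` and `COLLADA_NS` are made up for illustration, and the namespace URI is copied as it appears in the question.

```python
import xml.etree.ElementTree as ET

# Namespace URI exactly as written in the question's samples.
COLLADA_NS = "http://www.collada.org/2005/11/colladaschema"


def normalize_namespace(root, ns=COLLADA_NS):
    """Move every element into `ns`, keeping only its local name."""
    for el in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(el.tag, str):
            continue
        # Tags look like "{uri}local" or just "local"; keep the local part.
        local = el.tag.rsplit("}", 1)[-1]
        el.tag = "{%s}%s" % (ns, local)
    return root


# The three shapes from the question, each with a hypothetical <asset/> child.
samples = [
    '<collada xmlns="%s" version="1.4.1"><asset/></collada>' % COLLADA_NS,
    '<collada><asset/></collada>',
    '<c:collada xmlns:c="%s"><c:asset/></c:collada>' % COLLADA_NS,
]

roots = [normalize_namespace(ET.fromstring(s)) for s in samples]

# After normalization, one namespace-aware query works for every document.
found = [r.find("c:asset", {"c": COLLADA_NS}) is not None for r in roots]
```

Once every document has been passed through a step like this right after parsing, the XPath expressions can always use the same prefix mapping, which is the point of normalizing first.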