libxml2

Language: C

XML / Parsing

libxml2 was developed as part of the GNOME project and has become a de facto standard for XML parsing in C applications. It is used in software ranging from desktop applications to web services that need robust XML processing.

libxml2 is a C library for parsing XML documents. It provides a comprehensive set of functions for reading, validating, navigating, and manipulating XML data efficiently, and supports standards like XPath, XInclude, and XPointer.

Installation

linux: sudo apt install libxml2-dev
mac: brew install libxml2
windows: Download precompiled binaries from http://xmlsoft.org/sources/

Usage

libxml2 allows developers to parse XML documents either in memory or from files, traverse XML trees, extract information with XPath, validate against DTD or XML Schema, and modify XML content programmatically.

Parsing an XML file

#include <libxml/parser.h>
#include <libxml/tree.h>
#include <stdio.h>

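int main(void) {
    /* Check that the headers match the libxml2 version actually linked. */
    LIBXML_TEST_VERSION

    xmlDoc *doc = xmlReadFile("example.xml", NULL, 0);
    if (doc == NULL) {
        fprintf(stderr, "Failed to parse XML\n");
        return 1;
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}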

Reads an XML file into memory and checks for successful parsing.

Accessing root element

xmlNode *root = xmlDocGetRootElement(doc);
if (root != NULL)
    printf("Root element: %s\n", (const char *)root->name);

Retrieves the root element of an XML document and prints its tag name.

Iterating over child nodes

for(xmlNode *node = root->children; node; node = node->next) {
    if(node->type == XML_ELEMENT_NODE)
        printf("Node name: %s\n", node->name);
}

Loops through all child nodes of the root element and prints element names.

Using XPath

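#include <libxml/xpath.h>

xmlXPathContextPtr xpathCtx = xmlXPathNewContext(doc);
xmlXPathObjectPtr xpathObj = xpathCtx ? xmlXPathEvalExpression(BAD_CAST "//book", xpathCtx) : NULL;
/* xpathObj is NULL on error; nodesetval may be NULL when nothing matches. */
if (xpathObj != NULL && xpathObj->nodesetval != NULL) {
    for (int i = 0; i < xpathObj->nodesetval->nodeNr; i++) {
        xmlNodePtr node = xpathObj->nodesetval->nodeTab[i];
        printf("Book node: %s\n", (const char *)node->name);
    }
}
xmlXPathFreeObject(xpathObj);   /* both free functions accept NULL */
xmlXPathFreeContext(xpathCtx);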

Uses XPath to select all `<book>` elements in the document.

Modifying XML nodes

xmlNodePtr newNode = xmlNewChild(root, NULL, (xmlChar*)"author", (xmlChar*)"John Doe");

Adds a new `<author>` child element with text content to the root element. `xmlNewChild` returns NULL on failure, so check the result before using it.

Validating against a DTD

#include <libxml/valid.h>

xmlValidCtxtPtr ctxt = xmlNewValidCtxt();
int ret = (ctxt != NULL) ? xmlValidateDocument(ctxt, doc) : 0;  /* 1 = valid, 0 = not valid */
xmlFreeValidCtxt(ctxt);

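Validates the document against the DTD declared in its DOCTYPE. `xmlValidateDocument` returns 1 if the document is valid and 0 otherwise, including when the document has no DTD at all.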

Error Handling

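xmlReadFile returns NULL: Check that the file exists, is readable, and contains well-formed XML.
Invalid XPath expression: xmlXPathEvalExpression returns NULL when the expression cannot be compiled or evaluated. An expression that is correct but matches nothing yields an empty node set instead, so check both cases.
Memory leaks: Free each document with xmlFreeDoc() and each XPath object and context with its matching free function. Nodes linked into a document are freed with it; call xmlFreeNode() only on nodes that were never attached or have been unlinked.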

Best Practices

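Call `xmlCleanupParser()` once, at program exit, after all other libxml2 use has finished. It releases the library's global state rather than any individual parser, so do not call it from library code or while other threads may still use libxml2.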

Check return values when parsing or modifying XML to handle errors.

Pass strings to the API as UTF-8. libxml2 stores text internally as UTF-8 `xmlChar` strings, whatever the encoding of the input file.

Free documents with `xmlFreeDoc()` after use to prevent memory leaks.

Prefer XPath for selecting elements by path or condition. It is usually more concise than manual tree traversal, although a simple loop over direct children can be cheaper.