Evaluate Xpath2.0 in python
Asked Answered
O

2

13

I have an XPath expression as shown below.

if(replace(//p[1]/text(),'H','h') = 'hello') then //p[1]/text() else if(//p[1]/text() = 'world') then //p[2]/text() else 'notFound'

I want to display which 'if ' expression worked.

e.g //p[1]/text() if first 'if' expression worked.

'If' expression can have nested if, for loops and xpath2.0 functions.

I can't find any xpath2.0 library for python. So I tried to convert this Js library to python still I can able to split xpath2.0 expression to lexers but can't convert it fully to python.

Suggest me some Xpath2.0 library for python, if available. Also how to interpret XPath expression and display which part of the expression worked?

Orms answered 8/10, 2018 at 17:37 Comment(3)
lxml.de/xpathxslt.html#xpath is a fine library, or simply docs.python.org/3.7/library/…Perishable
Ya i tried it. It supports only xpath1.0 expression and I extend it to support xpath 2.0 functions like replace, tokenize but 'if' and 'for' expression can't be evaluated.Orms
Saxon 9.8 supports XPath 3 and 2 and is available in a Saxon/C version at saxonica.com/download/c.xml, so as other libraries for Python are written in C it might be possible to build one for Python based on Saxon/C, at least for XPath 2/3 evaluation, not sure how far you would get to dig into the XPath implementation.Messinger
R
14

As you already know, lxml, the cornerstone of Python's XML/XPath support, features only

XPath 1.0, XSLT 1.0 and the EXSLT extensions through libxml2 and libxslt

We still have some options.

I researched this topic recently (specifically, Python's XQuery support).
See the W3C for a reference list of XML Query Implementations.

  1. Python Modules with XPath 2+ and EXSLT Extensions (e.g. an EXSLT for regex matching)
    There are some modules on PiPy that partially offer XPath 2.0+ functionality.

  2. There are some OSS XML/NoSQL-DBMS that implement XPath/XQuery 2.0 functions, e.g.

    • Zorba, an open source portable embeddable C++ implementation of XQuery 1.0/2.0 which has Python bindings (this question has some pointers),
    • as well as Sedna and some commercial DBMS. Depending on your project, this could be a good choice.
  3. I believe Saxon/C (by Michael Kay) with Cython is the most promising road.
    It was tried before using Boost.Python and at pysaxon.
    Update: A Saxon/C extension for Python 3 has been published meanwhile.

  4. You could use a subprocess to call a CLI XML processor (as suggested here), e.g. subprocess.call(["saxon", "-o:output.xml", "-s:file.xml", "file.xslt"])

  5. Another option is using XSLT/XPath/XQuery with saxon and/or other Java XML classes in Jython.

  6. Finally, you could set-up a web service that does the hard work for you in a language like Java, .NET, etc, that comes with proper XPath 3+ support (also suggested by Kay here).

Still somewhat disappointing, especially for a big language like Python.

Revisionism answered 7/11, 2018 at 18:34 Comment(0)
R
3

As mentioned by Martin we have a Saxon product for C/C++/PHP languages, called Saxon/C which out for a few years now. We have been seeing users interested in using Saxon/C with Python. 

Update 2019 A Saxon/C extension for Python3 has now been released : SaxonC - XML document processing for C/C++, PHP and Python

One user has successfully used Boost.Python to interface with our C++ library.  Another user has done the interfacing in a different way: https://github.com/ajelenak/pysaxon

Rilke answered 3/1, 2019 at 23:14 Comment(2)
An official Saxon/C interface for Python would be awesome.Exultant
We have now release an Saxon/C extension for Python3: saxonica.com/saxon-c/index.xmlRilke

© 2022 - 2024 — McMap. All rights reserved.