Query language for python objects [closed]
Asked Answered
S

7

17

I'm looking for a library that lets me run SQL-like queries on python "object databases". With object database I mean a fairly complex structure of python objects and lists in memory. Basically this would be a "reverse ORM" - instead of providing an object oriented interface to a relational database, it would provide a SQL-ish interface to an object database.

C#'s LINQ is very close. Python's list comprehensions are very nice, but the syntax gets hairy when doing complex things (sorting, joining, etc.). Also, I can't (easily) create queries dynamically with list comprehensions.

The actual syntax could either be string based, or use a object-oriented DSL (a la from(mylist).select(...)). Bonus points if the library would provide some kind of indices to speed up search.

Does this exist or do I have to invent it?

Suneya answered 26/2, 2011 at 12:3 Comment(1)
Very good question. I'm curious if there's specific term for this.Gull
G
8

Dee is aiming to be SQL (ish; the author prefers relational calculus) for in-memory structures. There's a GROUP BY equivalent and everything.

Garlan answered 26/2, 2011 at 12:12 Comment(1)
I'm not sure if I can actually use this, but it looks fascinating. The relational aspect reminds me a little bit of Prolog.Suneya
G
3

We're launching PythonQL, which does exactly what you mention (its quite similar to C# LINQ). PythonQL demo site

Gromwell answered 14/10, 2016 at 19:20 Comment(0)
P
2

If you like list comprehensions, don't forget about their "lazy" counterpart: generator expressions. These should at least to some extent solve the question of dynamically constructing queries. Complement this with itertools, and of course some of the builtin functions that work on iterables, and you could say you have your Python "LINQ to Object" (or at least something very close).

You won't get "pseudo-sql" syntax built in in python as with C# and LINQ, but that is a matter of taste I guess. (personally, I like the fact that it remains Python, uses the same keywords etc. instead of trying to be sql: familiar if and for clauses instead of introducing where and from and so on).

The object-oriented DSL approach seems feasible (and easier to implement than string-based I think). You will find this sort of thing in ORMs like SqlAlchemy, but I don't know if anyone has done that already for "normal" data structures (I guess people don't really feel it's necessary?)

For indices and so on: I guess you'll have to search for actual "object databases"...

Peristyle answered 26/2, 2011 at 14:0 Comment(4)
I guess people don't really feel it's necessary? It could be very helpful. You can go even furhter and allow access to python object using XPath. Going one step further you could make it possible to do XSL transformations on Python objects...Gull
@Piotr: I agree that it could be helpful. It's just that it doesn't seem to exist (or at least not mainstream), while it should be possible, so... As for XSL transformations on Python objects: I'm not sure I want to go there ;-) (and I'm not an xslt hater, on the contrary, but sometimes, it's worth considering what the letters DSL stand for)Peristyle
I'm not sure I want to go there ;-) I know what you mean. It makes sense when you often have to deal with such tasks. Being in such a situation I see usefulness of these ideas.Gull
Yes, generator expressions are really nice in combination with itertools. Not just syntax-wise, but switching from [] to () can also save a lot of memory if done right.Suneya
M
2

One major difference between what SQL does and what you can do in idiomatic python, in SQL, you tell the evaluator what information you are looking for, and it works out the most efficient way of retrieving that based on the structure of the data it holds. In python, you can only tell the interpreter how you want the data, there's no equivalent to a query planner.

That said, there are a few extra tools above and beyond list comprehensions that help alot.

First, use a structure that closely resembles the declarative nature of SQL. Many of them are builtins. map, filter, reduce, zip, all, any, sorted, as well as the contents of the operator, functools and itertools packages, all offer a fairly concise way of expressing data queries.

Marvel answered 26/2, 2011 at 15:35 Comment(0)
I
2

Not quite exactly what you describe, but littletable works with lists of objects in memory, supports joins, pivots, queries. And each result is in turn another littletable, so you can easily perform a succession of filtering steps.

Impregnate answered 26/2, 2011 at 15:44 Comment(0)
I
1

See if sql4csv might be of help.

Impregnate answered 3/7, 2011 at 22:1 Comment(0)
S
1

I came across this as I was looking for a good library to support a similar use case. Turns out that Pandas works nicely for cases where join/group by type of queries are needed or for time series processing.

Secant answered 24/11, 2015 at 21:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.