How to write my own dialect in sqlalchemy to adapt HTTP API?

H

2

6

I'm trying to add a special data source to Superset (a data exploration platform). This database only supports HTTP API, and returns data in json format; for example:

> http://localhost/api/sql/query?q="select * from table"
< [{"id": 1, "value":10}, {"id": 2, "value": 30} ...]

Therefore, I have to write my own adapter in python SQLAlchemy for Superset. I have read the document and part of source code, but still need good example to follow.

Hurlow answered 12/5, 2017 at 4:8 Comment(0)

S

5

(Solution edited into the question by the OP)

I have solved this problem. Here is what I do.

Go to ./site-packages/sqlalchemy/dialects
Copy any concrete dialects to the new one (eg: named zeta) as the start point. A better way is to use
```
from sqlalchemy.engine.default import DefaultDialect
class ZetaDialect(DefaultDialect):
    ...
```
Add zeta into __all__ section of ./site-packages/sqlalchemy/dialects/__init__.py
Create a test program:

    from sqlalchemy import create_engine
    engine = create_engine('zeta://XXX')
    result = engine.execute("select * from table_name")
    for row in result:
        print(row)

Run it and get error. Use pdb to find the reason. In most cases, the reason is NotImplement some interfaces. Solve it one by one.
When test program gives correct answer, it has almost been done 90%. For completeness, we should also implement several interface used by inspectors:

    class ZetaDialect(DefaultDialect):
        # default_paramstyle = 'qmark'
        name = 'zeta'

        def __init__(self, **kwargs):
            DefaultDialect.__init__(self, **kwargs)

        @classmethod
        def dbapi(cls):
            return zeta_dbapi

        @reflection.cache
        def get_table_names(self, connection, schema=None, **kw):
            return [u'table_1', u'table_2', ...]

        @reflection.cache
        def get_pk_constraint(self, connection, table_name, schema=None, **kw):
            return []

        @reflection.cache
        def get_foreign_keys(self, connection, table_name, schema=None, **kw):
            return []

        @reflection.cache
        def get_unique_constraints(self, connection, table_name,
                                   schema=None, **kw):
            return []

        @reflection.cache
        def get_indexes(self, connection, table_name, schema=None, **kw):
            return []

        @reflection.cache
        def get_schema_names(self, connection, **kw):
            return []

        @reflection.cache
        def get_columns(self, connection, table_name, schema=None, **kw):
            # just an example of the column structure
            result = connection.execute('select * from %s limit 1' % table_name)
            return [{'default': None, 'autoincrement': False, 'type': TEXT, 'name': colname, 'nullable': False} for colname, coltype in result.cursor.description]

Saltire answered 12/5, 2017 at 4:8 Comment(1)

Instead of 3. rather use from sqlalchemy.dialects import registry registry.register("zeta", "myapp.dialect", "ZetaDialect") – Vivid 23/5, 2023 at 8:2

B

1

I wrote a Python library called shilellagh for this exact purpose — connecting Apache Superset to APIs and other non-SQL data sources.

Here are the docs on creating a new API interface (called an "adapter" in shillelagh) and creating a new SQLAlchemy dialect.

Baun answered 23/4 at 13:3 Comment(0)

Recommended topics

Hot tags