How to write my own dialect in sqlalchemy to adapt HTTP API?
Asked Answered
H

2

6

I'm trying to add a special data source to Superset (a data exploration platform). This database only supports HTTP API, and returns data in json format; for example:

> http://localhost/api/sql/query?q="select * from table"
< [{"id": 1, "value":10}, {"id": 2, "value": 30} ...]

Therefore, I have to write my own adapter in python SQLAlchemy for Superset. I have read the document and part of source code, but still need good example to follow.

Hurlow answered 12/5, 2017 at 4:8 Comment(0)
S
5

(Solution edited into the question by the OP)

I have solved this problem. Here is what I do.

  1. Go to ./site-packages/sqlalchemy/dialects

  2. Copy any concrete dialects to the new one (eg: named zeta) as the start point. A better way is to use

    from sqlalchemy.engine.default import DefaultDialect
    class ZetaDialect(DefaultDialect):
        ...
    
  3. Add zeta into __all__ section of ./site-packages/sqlalchemy/dialects/__init__.py

  4. Create a test program:

    from sqlalchemy import create_engine
    engine = create_engine('zeta://XXX')
    result = engine.execute("select * from table_name")
    for row in result:
        print(row)
  1. Run it and get error. Use pdb to find the reason. In most cases, the reason is NotImplement some interfaces. Solve it one by one.

  2. When test program gives correct answer, it has almost been done 90%. For completeness, we should also implement several interface used by inspectors:

    class ZetaDialect(DefaultDialect):
        # default_paramstyle = 'qmark'
        name = 'zeta'

        def __init__(self, **kwargs):
            DefaultDialect.__init__(self, **kwargs)

        @classmethod
        def dbapi(cls):
            return zeta_dbapi

        @reflection.cache
        def get_table_names(self, connection, schema=None, **kw):
            return [u'table_1', u'table_2', ...]

        @reflection.cache
        def get_pk_constraint(self, connection, table_name, schema=None, **kw):
            return []

        @reflection.cache
        def get_foreign_keys(self, connection, table_name, schema=None, **kw):
            return []

        @reflection.cache
        def get_unique_constraints(self, connection, table_name,
                                   schema=None, **kw):
            return []

        @reflection.cache
        def get_indexes(self, connection, table_name, schema=None, **kw):
            return []

        @reflection.cache
        def get_schema_names(self, connection, **kw):
            return []

        @reflection.cache
        def get_columns(self, connection, table_name, schema=None, **kw):
            # just an example of the column structure
            result = connection.execute('select * from %s limit 1' % table_name)
            return [{'default': None, 'autoincrement': False, 'type': TEXT, 'name': colname, 'nullable': False} for colname, coltype in result.cursor.description]
Saltire answered 12/5, 2017 at 4:8 Comment(1)
Instead of 3. rather use from sqlalchemy.dialects import registry registry.register("zeta", "myapp.dialect", "ZetaDialect")Vivid
B
1

I wrote a Python library called shilellagh for this exact purpose — connecting Apache Superset to APIs and other non-SQL data sources.

Here are the docs on creating a new API interface (called an "adapter" in shillelagh) and creating a new SQLAlchemy dialect.

Baun answered 23/4 at 13:3 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.