Python wheel package build and install with cython binary .so files only and config, resource folders

My code structure is as follows:

myMLCode
├── main.py
├── ML_lib
│   ├── __init__.py
│   ├── core.py
│   ├── set1
│   │   ├── __init__.py
│   │   ├── mymod1.py
│   │   └── mymod2.py
│   └── set2
│       ├── __init__.py
│       ├── mymod3.py
│       └── mymod4.py
├── config
│   ├── config1.yml
│   └── config2.yml
├── models
│   ├── model1.h5
│   └── model2.h5
└── setup.py

What I would like to do is to make a wheel file using the cythonized code from this whole package and be able to run the code seamlessly.

The expectation is to run it with python main.py. I also want to edit the config files and update the model files from time to time and continue to use the package.

What I managed to do so far is with the following setup.py file:

from Cython.Distutils import build_ext
from Cython.Build import cythonize
from setuptools.extension import Extension
from setuptools.command.build_py import build_py as build_py_orig
from pathlib import Path
from setuptools import find_packages, setup, Command
import os
import shutil


class MyBuildExt(build_ext):
    def run(self):
        build_ext.run(self)

        build_dir = Path(self.build_lib)
        root_dir = Path(__file__).parent

        target_dir = build_dir if not self.inplace else root_dir

        self.copy_file('ML_lib/__init__.py', root_dir, target_dir)
        self.copy_file('ML_lib/set1/__init__.py', root_dir, target_dir)
        self.copy_file('ML_lib/set2/__init__.py', root_dir, target_dir)

    def copy_file(self, path, source_dir, destination_dir):
        # Copy a single file into the build tree; skip it if it does not exist.
        if not (source_dir / path).exists():
            return

        shutil.copyfile(str(source_dir / path), str(destination_dir / path))

extensions = [
    Extension("core", ["core.py"]),
    Extension("ML_lib.set1.*", ["ML_lib/set1/*.py"]),
    Extension("ML_lib.set2.*", ["ML_lib/set2/*.py"]),
    Extension("ML_lib.*", ["ML_lib/*.py"]),
]

setup(
    name="myMLCode",
    version="0.0.1",
    author="myself",
    description="This is compiled ML code",
    ext_modules=cythonize(
        extensions,
        build_dir="build",
        compiler_directives=dict(
            always_allow_keywords=True
        ),
    ),
    data_files=[
        ('config',['config/config1.yml']),
        ('config',['config/config2.yml']),
        ('models',['models/model1.h5']),
        ('models',['models/model2.h5']),
    ],
    cmdclass={
        'build_ext': MyBuildExt
    },
    entry_points={
    },
)

This makes a wheel file which contains the following:

myMLCode-0.0.1-cp37-cp37m-linux_x86_64.whl
------------------------------------------
'main.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/__init__.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/__init__.py'
'ML_lib/core.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/mymod1.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/mymod2.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/__init__.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/__init__.py'
'ML_lib/set2/mymod3.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set2/mymod4.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set2/__init__.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set2/__init__.py'
'myMLCode-0.0.1.data/data/config/config1.yml'
'myMLCode-0.0.1.data/data/config/config2.yml'
'myMLCode-0.0.1.data/data/models/model1.h5'
'myMLCode-0.0.1.data/data/models/model2.h5'
'myMLCode-0.0.1.dist-info/METADATA'
'myMLCode-0.0.1.dist-info/WHEEL'
'myMLCode-0.0.1.dist-info/top_level.txt'
'myMLCode-0.0.1.dist-info/RECORD'
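
A wheel is an ordinary zip archive, so the listing above can be inspected programmatically. The following is a minimal sketch, assuming the wheel was built by setuptools and therefore contains a top_level.txt file (other build backends may not write one); the helper name top_level_names is hypothetical. It reports which top-level names the wheel actually makes importable, which is not necessarily the distribution name.

```python
import zipfile


def top_level_names(wheel_path):
    """Return the top-level import names recorded in a wheel's top_level.txt.

    Returns an empty list if the wheel has no such file.
    """
    with zipfile.ZipFile(wheel_path) as whl:
        for member in whl.namelist():
            # setuptools writes this file into the *.dist-info directory
            if member.endswith(".dist-info/top_level.txt"):
                return whl.read(member).decode("utf-8").split()
    return []
```

Calling top_level_names on the wheel from the question would show the names that can be passed to import after installation.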

I then installed this wheel file with pip install. I listed the installed packages to check that it is installed, then opened a python3.7 interpreter to use it, but I get an import error:

[user@userhome ~]$ pip3.7 list
Package            Version
------------------ -------
appdirs            1.4.4
distlib            0.3.1
filelock           3.0.12
importlib-metadata 4.0.1
pip                20.1.1
setuptools         47.1.0
six                1.15.0
typing-extensions  3.7.4.3
virtualenv         20.4.4
myMLCode           0.0.1
zipp               3.4.1
[user@userhome ~]$ python3.7
Python 3.7.9 (default, Apr 27 2021, 07:49:13)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import myMLCode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>

ModuleNotFoundError: No module named 'myMLCode'

I tried unzipping the wheel and running the code with the .so files directly. It works well, except for the config and model file references. The wheel puts those files in myMLCode-0.0.1.data/data/config and myMLCode-0.0.1.data/data/models. I worked around this by changing all relative paths in the source code to point to this new location. That works, but it is not an elegant solution, because the plain Python code then stops working: it does not know about these new folders.
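
One way to avoid editing every relative path is to resolve config and model locations in a single helper, so the plain Python code and the compiled code share the same lookup. This is only a sketch under stated assumptions: the environment variable MYMLCODE_HOME and the function resource_path are hypothetical names introduced for illustration, not something the package already defines.

```python
import os
from pathlib import Path

# Hypothetical name: an environment variable pointing at the directory
# that contains the config/ and models/ folders.
RESOURCE_ENV_VAR = "MYMLCODE_HOME"


def resource_path(*parts, default_root=None):
    """Build the path of a config or model file below one configurable root.

    The root is the MYMLCODE_HOME environment variable if it is set,
    otherwise default_root, otherwise the current working directory.
    """
    root = os.environ.get(RESOURCE_ENV_VAR) or default_root or os.getcwd()
    return Path(root).joinpath(*parts)
```

With this, code would call resource_path("config", "config1.yml") instead of using a hard-coded relative path, and the config and model folders could stay editable outside the installed package.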

Any pointers would be really helpful.

Therapeutics answered 12/5, 2021 at 10:42 Comment(4)
I have already looked at several related links, but could not find the full answer yet: #39499953 #56024786 - Therapeutics
The first thing is to manage packaging the wheel without cythonizing. Why doesn't your setup declare a packages list? Why are you using the broken data_files instead of package_data? Why is main.py outside of ML_lib and not packaged? - Attain
@Attain I added data_files following the answer in #24347950. I removed packages based on this post (bucharjan.cz/blog/…), and I also followed the setup from your answer in #39499953 and got the same package, but could not get the data files into it. - Therapeutics
The setup from my answer works, but your packaging right now is a mess. Again, my advice: write a setup script first that works without cythonizing. Don't use data_files. Include sources via packages=find_packages(). Don't import myMLCode, since you don't have a package or module with that name. Start with easy things and add complex stuff incrementally. - Attain
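
The advice in the last comment could be sketched roughly as follows: a plain, non-cythonized setup that declares its packages and ships data through package_data. This is an illustration, not the commenter's actual script. It assumes that config/ and models/ have been moved under ML_lib/ (package_data only picks up files inside a package), and setup_kwargs is a hypothetical helper name.

```python
# setup.py (sketch): get plain-Python packaging working first, add Cython later.
# ASSUMPTION: config/ and models/ now live inside ML_lib/, unlike the
# layout in the question, so that package_data can include them.


def setup_kwargs():
    """Return the keyword arguments to pass to setuptools.setup()."""
    return dict(
        name="myMLCode",
        version="0.0.1",
        # Explicit list for clarity; find_packages() would discover the same
        # packages as long as each directory has an __init__.py.
        packages=["ML_lib", "ML_lib.set1", "ML_lib.set2"],
        # Glob patterns are relative to the ML_lib package directory.
        package_data={"ML_lib": ["config/*.yml", "models/*.h5"]},
    )


# The real setup.py would end with:
#     from setuptools import setup
#     setup(**setup_kwargs())
```

Once a wheel built this way installs and imports cleanly, the ext_modules and cmdclass arguments from the question could be added back one step at a time.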

According to your folder structure, your module name should be ML_lib. Your wheel package name is not equal to the module name. If your module name is myMLCode, you need to add the following code to ML_lib/__init__.py:

from .myMLCode import * 

Then import the module in Python:

import ML_lib
Propylite answered 27/5, 2022 at 9:41 Comment(0)
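
To check which name is actually importable after installing the wheel, a small standard-library test can help. This is a minimal sketch; is_importable is a hypothetical helper name, and it is meant for top-level names only.

```python
import importlib.util


def is_importable(name):
    """Return True if a top-level module or package called `name` can be found."""
    return importlib.util.find_spec(name) is not None
```

In the environment from the question, is_importable("ML_lib") would be expected to succeed, while is_importable("myMLCode") would not, because myMLCode is only the distribution name shown by pip list.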
