My code structure is as follows:
myMLCode
│
├── main.py
├── ML_lib
│   ├── __init__.py
│   ├── core.py
│   ├── set1
│   │   ├── __init__.py
│   │   ├── mymod1.py
│   │   └── mymod2.py
│   └── set2
│       ├── __init__.py
│       ├── mymod3.py
│       └── mymod4.py
├── config
│   ├── config1.yml
│   └── config2.yml
├── models
│   ├── model1.h5
│   └── model2.h5
└── setup.py
I would like to build a wheel file from the cythonized code of this whole package and be able to run the code seamlessly.
The expectation is that it runs with python main.py.
I also want to be able to edit the config files and update the model files from time to time and keep using the package.
This is the setup.py file I have managed to put together so far:
from Cython.Distutils import build_ext
from Cython.Build import cythonize
from setuptools.extension import Extension
from setuptools.command.build_py import build_py as build_py_orig
from pathlib import Path
from setuptools import find_packages, setup, Command
import os
import shutil
class MyBuildExt(build_ext):
    def run(self):
        build_ext.run(self)
        build_dir = Path(self.build_lib)
        root_dir = Path(__file__).parent
        target_dir = build_dir if not self.inplace else root_dir
        # Copy the plain-Python __init__.py files next to the compiled modules.
        self.copy_file('ML_lib/__init__.py', root_dir, target_dir)
        self.copy_file('ML_lib/set1/__init__.py', root_dir, target_dir)
        self.copy_file('ML_lib/set2/__init__.py', root_dir, target_dir)

    def copy_file(self, path, source_dir, destination_dir):
        # Silently skip files that do not exist in the source tree.
        if not (source_dir / path).exists():
            return
        shutil.copyfile(str(source_dir / path), str(destination_dir / path))

extensions = [
    Extension("core", ["core.py"]),
    Extension("ML_lib.set1.*", ["ML_lib/set1/*.py"]),
    Extension("ML_lib.set2.*", ["ML_lib/set2/*.py"]),
    Extension("ML_lib.*", ["ML_lib/*.py"]),
]

setup(
    name="myMLCode",
    version="0.0.1",
    author="myself",
    description="This is compiled ML code",
    ext_modules=cythonize(
        extensions,
        build_dir="build",
        compiler_directives=dict(
            always_allow_keywords=True
        )),
    data_files=[
        ('config', ['config/config1.yml']),
        ('config', ['config/config2.yml']),
        ('models', ['models/model1.h5']),
        ('models', ['models/model2.h5']),
    ],
    cmdclass={
        'build_ext': MyBuildExt
    },
    entry_points={
    },
)
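Hand-written glob patterns like the ones in the extensions list are easy to get wrong. As a minimal sketch (not part of the original setup.py), the module list could instead be generated from the source tree, skipping `__init__.py` so the package markers stay as plain Python, which the custom build step copies anyway. The helper name `discover_modules` is hypothetical and only the standard library is used.

```python
from pathlib import Path


def discover_modules(package_dir, root="."):
    """Return (dotted_name, source_path) pairs for every non-__init__ .py file."""
    root = Path(root)
    pairs = []
    for source in sorted((root / package_dir).rglob("*.py")):
        if source.name == "__init__.py":
            continue  # leave package markers uncompiled
        relative = source.relative_to(root)
        dotted = ".".join(relative.with_suffix("").parts)
        pairs.append((dotted, relative.as_posix()))
    return pairs


# In setup.py the pairs could feed setuptools.Extension, for example:
#   extensions = [Extension(name, [src]) for name, src in discover_modules("ML_lib")]
```

Each source file then maps to exactly one extension whose name matches its location, so a copy-paste slip between set1 and set2 cannot occur.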
This makes a wheel file which contains the following:
myMLCode-0.0.1-cp37-cp37m-linux_x86_64.whl
------------------------------------------
'main.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/__init__.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/__init__.py'
'ML_lib/core.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/mymod1.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/mymod2.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/__init__.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set1/__init__.py'
'ML_lib/set2/mymod3.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set2/mymod4.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set2/__init__.cpython-37m-x86_64-linux-gnu.so'
'ML_lib/set2/__init__.py'
'myMLCode-0.0.1.data/data/config/config1.yml'
'myMLCode-0.0.1.data/data/config/config2.yml'
'myMLCode-0.0.1.data/data/models/model1.h5'
'myMLCode-0.0.1.data/data/models/model2.h5'
'myMLCode-0.0.1.dist-info/METADATA'
'myMLCode-0.0.1.dist-info/WHEEL'
'myMLCode-0.0.1.dist-info/top_level.txt'
'myMLCode-0.0.1.dist-info/RECORD'
I then installed this wheel file with pip install. I listed the installed packages to check that it is there and then opened a python3.7 interpreter to use it, but I get a ModuleNotFoundError.
[user@userhome ~]$ pip3.7 list
Package Version
------------------ -------
appdirs 1.4.4
distlib 0.3.1
filelock 3.0.12
importlib-metadata 4.0.1
pip 20.1.1
setuptools 47.1.0
six 1.15.0
typing-extensions 3.7.4.3
virtualenv 20.4.4
myMLCode 0.0.1
zipp 3.4.1
[user@userhome ~]$ python3.7
Python 3.7.9 (default, Apr 27 2021, 07:49:13)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import myMLCode
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'myMLCode'
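The name shown by pip list is the distribution name, which is not necessarily an importable module. A quick way to check which top-level names a wheel actually provides is to read the `top_level.txt` file inside its `.dist-info` directory. The following is a minimal sketch using only the standard library; the function name `top_level_names` is hypothetical.

```python
import zipfile


def top_level_names(wheel_path):
    """Return the import names listed in a wheel's top_level.txt, if present."""
    with zipfile.ZipFile(wheel_path) as wheel:
        for entry in wheel.namelist():
            if entry.endswith(".dist-info/top_level.txt"):
                text = wheel.read(entry).decode("utf-8")
                return [line.strip() for line in text.splitlines() if line.strip()]
    return []  # no top_level.txt in this archive
```

Calling it on the wheel built above would show the names that can be passed to import. If myMLCode is not among them, the error in the session above is expected.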
I also unzipped the wheel and ran the code with the .so files directly. That works well, except for the config and model file references. The wheel puts those files under myMLCode-0.0.1.data/data/config and myMLCode-0.0.1.data/data/models. As a hack, I changed all relative paths in the source code to point to this new location. This works, but it is not an elegant solution, because the plain Python code stops working: it does not know about these new folders.
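To illustrate what a less brittle lookup than hard-coded paths might look like, here is a minimal sketch that resolves config and model files against a configurable base directory, so the same code can run from the source tree and from an installed copy. The function name `resolve_resource` and the environment variable `MYMLCODE_HOME` are assumptions made up for this example, not existing names in the project.

```python
import os
from pathlib import Path


def resolve_resource(relative_path, env_var="MYMLCODE_HOME", fallback_dir=None):
    """Locate a config or model file without hard-coding the install layout."""
    candidates = []
    override = os.environ.get(env_var)  # hypothetical variable name
    if override:
        candidates.append(Path(override))
    # Fall back to an explicit directory, or the current working directory.
    candidates.append(Path(fallback_dir) if fallback_dir else Path.cwd())
    for base in candidates:
        candidate = base / relative_path
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"{relative_path} not found under {candidates}")
```

With this, a call such as `resolve_resource("config/config1.yml")` would work when run from the project root, while an installed copy could point the environment variable at a directory holding the editable config and model files.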
Any pointers would be really helpful.
packages list? Why are you using broken data_files instead of package_data? Why is main.py outside of ML_lib and not packaged? – Attain
data_files. Include sources via packages=find_packages(). Don't import myMLCode since you don't have a package or module named like that. Start with easy things and add complex stuff incrementally. – Attain