Poetry include additional data files in wheel

I have a simple python package, let's call it my_package.

Its files are located in src/python/my_package. In addition, there is a data folder in the repository root, which should be included in the resulting Python wheel inside the my_package package.

.
├── src
│   └── python
│       └── my_package
├── data
│   └── stuff.json
└── pyproject.toml

I have not found a way to configure Poetry so that it places the additional data folder in the correct location.

Here is my pyproject.toml

[tool.poetry]
name = "my-package"
version = "2.10.0" 

packages = [
    { include = "my_package", from = "src/python" }
]

# This does not work! It puts the `data` folder into site-packages/data instead of site-packages/my_package/data
include = [
    { path = "data", format = ["sdist", "wheel"] }
]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
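
One way to confirm where the data files end up is to list the contents of the built wheel, since a wheel is a zip archive. The following is a minimal sketch, assuming the wheel was written to dist/ by `poetry build`; the helper name `data_paths_in_wheel` is illustrative and not part of Poetry.

```python
import zipfile
from pathlib import Path


def data_paths_in_wheel(wheel_path, filename="stuff.json"):
    """Return every archive member whose basename equals `filename`."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [name for name in wheel.namelist() if Path(name).name == filename]


if __name__ == "__main__":
    # Assumption: wheels live in ./dist, the default output folder of `poetry build`.
    for wheel in Path("dist").glob("*.whl"):
        print(wheel.name, data_paths_in_wheel(wheel))
```

With the configuration above, the reported path would be expected to start with data/ rather than my_package/data/, which matches the behaviour described in the comment.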

I also found the following solution, using a pre-build script: https://github.com/python-poetry/poetry/issues/5539#issuecomment-1126818974

Problem: the resulting wheel is no longer pure; it is tagged as depending on CPython.

I also tried a symlink, but that does not work either (see: How to include symlinks and the linked file in a python wheel using Poetry?).

Question: What is the correct way to include additional resource files in a python wheel using poetry?

Ana answered 3/4, 2023 at 7:10 Comment(6)
Your data files must be part of the importable package. Can't you move the data directory to src/python/my_package/data? This would make everything much easier. - Change
Unfortunately, that's not an option, since the data folder is also used by some other scripts within the repo, so it cannot be moved. - Ana
A. I think I know how I would solve this with setuptools, but I cannot seem to find a straightforward solution with Poetry. Are you bound to Poetry? Would you consider changing to setuptools for this? -- B. I tried to solve this with Poetry's build.py but hit the same wheel tag issue. I asked the maintainers, and there is no way to fix the wheel tags from build.py. I think you can use the wheel tool to fix the wheel tags afterwards. -- C. Maybe you could have a pre-build step (maybe a shell script) to copy data to src/python/my_package/data. - Change
Related: github.com/python-poetry/poetry-core/pull/227 -- On the other hand, I tried using the wheel tool to fix the tags, and it seems to work. But this means the project needs a post-build step (in the CI/CD, for example). It seems better to have a pre-build step instead that copies the data directory to the right location, because then there is no need for build.py at all. - Change
Hatch could probably do it easily as well: hatch.pypa.io/latest/config/build/#rewriting-paths - Change
Have you tried include = [{ path = "data/*", format = ["sdist", "wheel"] }]? - Shuddering
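
The pre-build step suggested in the comments could be sketched in Python roughly as follows. This is only an illustration under the repository layout from the question; the function name `copy_data_into_package` is hypothetical, and the script is assumed to be run from the repository root before `poetry build`.

```python
import shutil
from pathlib import Path


def copy_data_into_package(repo_root):
    """Copy <repo_root>/data to <repo_root>/src/python/my_package/data."""
    root = Path(repo_root)
    source = root / "data"
    target = root / "src" / "python" / "my_package" / "data"
    # Remove any stale copy so files deleted from data/ do not linger in the package.
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    print(copy_data_into_package("."))
```

Because the copy then lives inside my_package, the `packages` entry should pick it up without any build.py, so the wheel should stay pure. One caveat to verify: if the copied folder is listed in .gitignore, Poetry may leave it out of the build unless it is also named under `include`. At runtime the file could then be read with `importlib.resources.files("my_package") / "data" / "stuff.json"` (Python 3.9+).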

As you mentioned, I think that include is the correct way to handle it.

I would rearrange the directory structure so that pyproject.toml can handle it more easily. I am thinking of something like the following:

mypackage
├── mypackage
│   ├── src
│   │   └── my_package.py
│   └── data
│       └── stuff.json
└── pyproject.toml

Then put the appropriate __init__.py files in place for discovery (if needed).

Finally, I think the following adjustments to the pyproject.toml should work.

packages = [
    { include = "my_package", from = "mypackage/src" }
]

include = [
    { path = "mypackage/data", format = ["sdist", "wheel"] }
]

I cannot test, because I use PDM instead of Poetry, and Poetry has a few quirks where it doesn't fully follow the standard. But I think that should be right.

Dissimulation answered 8/5 at 15:57 Comment(0)
