Introduction
I’m having trouble making reproducible conda environments. I’ve posted the question below on Stack Overflow and Reddit but I’ve got nothing. I’m leaving the question here for easy future reference. If I come up with a solution I’ll update this post.
The question
Hi everyone.
I’m really struggling to create a reproducible conda environment. I’ll outline the approach I’ve taken so far and the issue I’ve encountered. I’d appreciate any tips for what I can do to troubleshoot next or resources I could check.
As background, I work on a small team, and want to be able to share a copy of an environment I’ve been using with other members of my team so I can be sure we have identical versions of all the libraries required for our work.
My current workflow is as follows:
- Write out an environment file with unpinned dependencies and let conda build the environment
name: example_env_build
channels:
- conda-forge
- defaults
dependencies:
- pandas
- requests
The actual environment has a lot more stuff in it, but that’s the idea.
- I then create the environment with
conda env create -f example_env_build.yml
- I export the environment so that all versions and their dependencies will be pinned with
conda env export -n example_env_build --no-builds --file test_export.yml
I added --no-builds because I was finding that certain builds were getting marked as broken and causing issues, and getting the version right seemed close enough for my purposes.
- I edit the test_export.yml file, change the name to example_env, and remove the prefix line from the bottom.
- I build a new environment with this pinned file just to make sure it goes OK, and then share the file with the rest of my team.
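The manual edit of the exported file (renaming the environment and dropping the prefix line) could be scripted so it is done the same way every time. The following is only a minimal sketch: the function name clean_export, the sample file contents, and the prefix path are made up for illustration, and it assumes the name: and prefix: keys sit at the start of their own lines, which is how the export looked in my files.

```python
def clean_export(text: str, new_name: str) -> str:
    """Rename the environment and drop the machine-specific prefix line."""
    cleaned = []
    for line in text.splitlines():
        if line.startswith("prefix:"):
            # The prefix is an absolute path on the exporting machine, so skip it.
            continue
        if line.startswith("name:"):
            line = f"name: {new_name}"
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"


# Hypothetical export contents; the prefix path is invented for this example.
sample = """name: example_env_build
channels:
  - conda-forge
  - defaults
dependencies:
  - pandas-profiling=2.4.0
  - phik=0.9.10
prefix: /home/me/miniconda3/envs/example_env_build
"""

print(clean_export(sample, "example_env"))
```

To use it on a real file, read test_export.yml into a string, pass it through clean_export, and write the result back out.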
This has generally worked well if everyone tries to build the environment relatively quickly after the file is created. However, the whole point of being able to specify a reproducible environment is that I should be able to recreate that environment at any time. Someone on my team recently got a new computer so I was trying to help her set up her environment and ran into a series of conflicts. To troubleshoot I tried to rebuild the environment on my machine and ran into the same situation.
For troubleshooting I did the following:
- Clone my environment so I have a backup while I mess around
conda create --name example_env_clone --clone example_env
- Export the environment
conda env export -n example_env --no-builds --file example_env_rebuild.yml
- Delete the example environment so I can rebuild it
conda env remove --name example_env
- Try and recreate the environment I just exported
conda env create -f example_env_rebuild.yml
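Since this round trip is something I may want to repeat, the four troubleshooting commands could be wrapped in a small script. This is a rough sketch, not a tested tool: the function names round_trip_commands and run_round_trip are hypothetical, and the commands themselves are exactly the ones listed above.

```python
import subprocess


def round_trip_commands(env_name: str, export_file: str) -> list:
    """Build the clone, export, remove, and recreate commands in order."""
    clone_name = f"{env_name}_clone"
    return [
        ["conda", "create", "--name", clone_name, "--clone", env_name],
        ["conda", "env", "export", "-n", env_name, "--no-builds", "--file", export_file],
        ["conda", "env", "remove", "--name", env_name],
        ["conda", "env", "create", "-f", export_file],
    ]


def run_round_trip(env_name: str, export_file: str) -> None:
    """Run each command in turn and stop at the first one that fails."""
    for command in round_trip_commands(env_name, export_file):
        print("running:", " ".join(command))
        # conda may ask for confirmation; the prompt is passed through to the terminal.
        subprocess.run(command, check=True)


commands = round_trip_commands("example_env", "example_env_rebuild.yml")
for command in commands:
    print(" ".join(command))
```

As written, the script only prints the commands. Calling run_round_trip("example_env", "example_env_rebuild.yml") would actually execute them, including deleting the environment, so the clone step matters.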
From there I ran into all sorts of version conflicts. I don’t understand this because a) These are all versions being used in a working environment and b) a lot of the “conflicts” don’t seem to be conflicts to me. As an example, here’s one from my current attempt:
Package phik conflicts for:
phik=0.9.10
pandas-profiling=2.4.0 -> phik[version='>=0.9.8']
I picked that one basically at random but there are tons like that. As I read it I’m trying to install phik 0.9.10, and pandas-profiling requires >=0.9.8, which 0.9.10 satisfies.
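To double check my reading of that conflict, here is the comparison spelled out. It is a simplified sketch that only handles plain dotted numeric versions and ">=" constraints; it is not how conda itself matches versions, and the helper names are made up for this example.

```python
def parse_version(version: str) -> tuple:
    """Turn '0.9.10' into (0, 9, 10) so parts compare as numbers."""
    return tuple(int(part) for part in version.split("."))


def meets_lower_bound(pin: str, constraint: str) -> bool:
    """Check a pin like 'phik=0.9.10' against a constraint like '>=0.9.8'."""
    _, pinned_version = pin.split("=", 1)
    if not constraint.startswith(">="):
        raise ValueError("this sketch only understands '>=' constraints")
    minimum_version = constraint[2:]
    return parse_version(pinned_version) >= parse_version(minimum_version)


print(meets_lower_bound("phik=0.9.10", ">=0.9.8"))  # True
```

So by a straightforward numeric comparison the pinned phik version does satisfy what pandas-profiling asks for, which is why the reported conflict confuses me.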
I’m at my wits’ end here. I’ve read a million “how to manage conda environments” guides (for example this, this, and this) along with the conda environment management docs. All of them seem to indicate that what I’m doing should work perfectly fine, but my team and I constantly run into issues.
Has anyone had a similar experience? Is there something I’m missing, or a resource I could consult? I’d greatly appreciate any pointers.
Thanks