• Project package managers: poetry, pdm, hatch (pypa)
    • poetry: good for apps; people complain its dependency resolution is slow, and its lockfile/metadata handling is nonstandard
    • hatch: doesn’t support locking dependencies yet (source)
    • Recent summary of the mess (and followup)
    • Nothing supports monorepos
  • hatch
    • Requires manually editing pyproject.toml to manage deps
    • By default, `hatch env create` creates the env somewhere like /Users/yang/Library/Application Support/hatch/env/virtual/PROJECT/s85Jsv-M/PROJECT/bin/python
    • If you want to use local root, need to always pass --data-dir . to hatch, or set HATCH_DATA_DIR (source)
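    • A sketch of keeping hatch's state under the project root (the `.hatch` directory name here is my own choice, nothing hatch mandates):

```shell
# Keep hatch's data (incl. its virtualenvs) under the project root instead of
# the per-user data dir. Either export once (e.g. from .envrc):
export HATCH_DATA_DIR="$PWD/.hatch"
# ...or pass it per invocation:
#   hatch --data-dir "$PWD/.hatch" env create
```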
  • Starter projects
    • https://github.com/browniebroke/pypackage-template - using this
    • https://github.com/cjolowicz/cookiecutter-hypermodern-python
    • https://github.com/smarlhens/python-boilerplate
  • Why conda? Just because it provides some nice prebuilt Python binaries that are fast to install.
    • If you try to use asdf instead, you have to build Python from source, which first requires installing a bunch of OS packages
    • If you install from apt, you are stuck with whatever version your OS ships
  • 2023-12
    • gpt-neox: requires CUDA 11.7, so installed cu117 via the runfile installer (chose to omit the kernel driver, and chose an alternate install path, but it seems to ignore the alt path). Then added to PATH and LD_LIBRARY_PATH as the post-install instructions say.
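    • The post-install steps amount to roughly this (assuming the default /usr/local/cuda-11.7 runfile location, since the alternate path seemed to be ignored):

```shell
# CUDA runfile post-install: put the toolchain on PATH and its libs on the
# loader path (default runfile install location assumed).
export PATH=/usr/local/cuda-11.7/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
```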

    • axolotl:

      • This fails to build:

        $ pip3 install packaging
        $ pip3 install -e '.[flash-attn,deepspeed]'
        ...
        Collecting flash-attn==2.3.3 (from axolotl==0.3.0)
          Downloading flash_attn-2.3.3.tar.gz (2.3 MB)
             ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 256.1 MB/s eta 0:00:00
          Installing build dependencies ... done
          Getting requirements to build wheel ... error
          error: subprocess-exited-with-error
        
          × Getting requirements to build wheel did not run successfully.
          │ exit code: 1
          ╰─> [20 lines of output]
              Traceback (most recent call last):
                File "/home/ubuntu/axolotl2/.direnv/python-3.11/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
                  main()
                File "/home/ubuntu/axolotl2/.direnv/python-3.11/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
                  json_out['return_val'] = hook(**hook_input['kwargs'])
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/home/ubuntu/axolotl2/.direnv/python-3.11/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
                  return hook(config_settings)
                         ^^^^^^^^^^^^^^^^^^^^^
                File "/tmp/pip-build-env-q4j7s6bv/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
                  return self._get_build_requires(config_settings, requirements=['wheel'])
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/tmp/pip-build-env-q4j7s6bv/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
                  self.run_setup()
                File "/tmp/pip-build-env-q4j7s6bv/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 480, in run_setup
                  super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
                File "/tmp/pip-build-env-q4j7s6bv/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 311, in run_setup
                  exec(code, locals())
                File "<string>", line 8, in <module>
              ModuleNotFoundError: No module named 'packaging'
              [end of output]
        
        # can't just run the whole install with --no-build-isolation
        $ pip3 install --no-build-isolation --no-cache-dir -e '.[flash-attn,deepspeed]'
        Obtaining file:///home/ubuntu/axolotl2
          Checking if build backend supports build_editable ... done
          Preparing editable metadata (pyproject.toml) ... error
          error: subprocess-exited-with-error
        
          × Preparing editable metadata (pyproject.toml) did not run successfully.
          │ exit code: 1
          ╰─> [12 lines of output]
              running dist_info
              creating /tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info
              writing /tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info/PKG-INFO
              writing dependency_links to /tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info/dependency_links.txt
              writing requirements to /tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info/requires.txt
              writing top-level names to /tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info/top_level.txt
              writing manifest file '/tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info/SOURCES.txt'
              reading manifest file '/tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info/SOURCES.txt'
              adding license file 'LICENSE'
              writing manifest file '/tmp/pip-modern-metadata-rzwoxot9/axolotl.egg-info/SOURCES.txt'
              creating '/tmp/pip-modern-metadata-rzwoxot9/axolotl-0.3.0.dist-info'
              error: invalid command 'bdist_wheel'
              [end of output]
        
          note: This error originates from a subprocess, and is likely not a problem with pip.
        error: metadata-generation-failed
        
        × Encountered error while generating package metadata.
        ╰─> See above for output.
        
        note: This is an issue with the package mentioned above, not pip.
        
        # instead, need to do:
        
        pip3 install packaging wheel
        pip3 install --no-build-isolation flash-attn==2.3.3
        pip3 install -e '.[flash-attn,deepspeed]'
        
      • At import time, I end up with an error importing flash_attn:

      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/home/ubuntu/axolotl/.direnv/python-3.11/lib/python3.11/site-packages/flash_attn/__init__.py", line 3, in <module>
          from flash_attn.flash_attn_interface import (
        File "/home/ubuntu/axolotl/.direnv/python-3.11/lib/python3.11/site-packages/flash_attn/flash_attn_interface.py", line 8, in <module>
          import flash_attn_2_cuda as flash_attn_cuda
      ImportError: /home/ubuntu/axolotl/.direnv/python-3.11/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
      
      • axolotl requires flash-attn, whose flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so depends on the symbol _ZN3c104cuda9SetDeviceEi
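      • That mangled name can be demangled with binutils' c++filt to see what the flash-attn binary actually expects from torch (a c10 CUDA helper, presumably from a newer torch than the one installed):

```shell
# Demangle the symbol flash_attn_2_cuda.so can't find.
echo '_ZN3c104cuda9SetDeviceEi' | c++filt
# c10::cuda::SetDevice(int)
```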

      • The axolotl Docker image builds with pytorch 2.0.1+cu117, though, seemingly without any additional magic pip commands

      • In Docker, torch doesn’t have this symbol either, so the flash-attn installed there must not even be looking for it - how is flash-attn getting installed there?

        root@7d356900a935:/workspace/axolotl# python -c 'import torch; print(torch.__version__)'
        2.0.1+cu117
        root@7d356900a935:/workspace/axolotl# python -c 'import torch; print(torch.__file__)'
        /root/miniconda3/envs/py3.9/lib/python3.9/site-packages/torch/__init__.py
        root@7d356900a935:/workspace/axolotl# find /root/miniconda3/envs/py3.9/lib/python3.9/site-packages/ -name libtorch_cuda.so
        /root/miniconda3/envs/py3.9/lib/python3.9/site-packages/torch/lib/libtorch_cuda.so
        root@7d356900a935:/workspace/axolotl# nm -u /root/miniconda3/envs/py3.9/lib/python3.9/site-packages/torch/lib/libtorch_cuda.so | grep -i setdevice
                         U cudaSetDevice@@libcudart.so.11.0
        
        
      • https://github.com/Dao-AILab/flash-attention/issues/620

      • On the host, I can make it work with:

        # installs torch 2.1.1+cu121
        pip install --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/ auto-gptq==0.5.1
        
        pip install flash-attn==2.3.3
        
      • Also got it working on the host with the following (caveat: not actually sure how I got it working; this alone isn’t sufficient):

        # maybe having torch did it? or maybe having specifically torch 2.1.2 and cuda 12.1?
        $ pip3 install --no-build-isolation torch packaging wheel
        $ pip3 install --no-build-isolation --no-cache-dir flash-attn==2.3.3
        $ python -c 'import flash_attn' # no error!
        $ pip3 install --no-cache-dir -e '.[flash-attn,deepspeed]'
        
      • For some reason, inside Docker, flash-attn builds/installs just fine - it can see packaging!

      • Thought maybe there was something funky with conda, but the same thing happens even with a normal Ubuntu Python: just (as root) `pip install packaging`, and then `pip install flash-attn` can see packaging

        With conda, these are the paths it sees:
        
        /root/miniconda3/envs/py3.9/lib/python3.9/site-packages/packaging/__init__.py
        sys.path is ['', '/root/miniconda3/envs/py3.9/lib/python39.zip', '/root/miniconda3/envs/py3.9/lib/python3.9', '/root/miniconda3/envs/py3.9/lib/python3.9/lib-dynload', '/root/miniconda3/envs/py3.9/lib/python3.9/site-packages']
        
        With normal ubuntu python, these are the paths it sees:
        
        /usr/local/lib/python3.10/dist-packages/packaging/__init__.py
        sys.path is ['', '/usr/lib/python310.zip', '/usr/lib/python3.10', '/usr/lib/python3.10/lib-dynload', '/usr/local/lib/python3.10/dist-packages', '/usr/lib/python3/dist-packages', '/tmp/pip-install-r7e9o_tz/flash-attn_4a95b88dd57f426096ebebb3b173b1a3']
        
        Not sure what you see if you use a venv; also not sure how to run `pip install .` in a way that targets a specific venv
        But I can definitely see this is where things break down
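      • A quick way to check which `packaging` (if any) a given interpreter resolves, without involving pip at all - run it with each Python (conda, system, venv) being compared:

```shell
# Ask the interpreter directly where it would import 'packaging' from.
python3 - <<'EOF'
import importlib.util, sys
spec = importlib.util.find_spec("packaging")
print("packaging ->", spec.origin if spec else "NOT FOUND")
print("prefix    ->", sys.prefix)
EOF
```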
        
      • So this seems like something specific to using venvs.
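      • Note to self: the usual way to make `pip install .` target a specific venv deterministically is to invoke that venv's own interpreter rather than whatever `pip` is first on PATH (a general pip fact, not axolotl-specific):

```shell
# A venv's own interpreter (and its `-m pip`) installs into that venv only.
python3 -m venv --without-pip /tmp/demo-venv   # --without-pip just keeps this demo offline
/tmp/demo-venv/bin/python -c 'import sys; print(sys.prefix)'   # prints the venv path
# real use would be: /path/to/venv/bin/python -m pip install -e '.[flash-attn,deepspeed]'
```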