64 changes: 64 additions & 0 deletions recipes/array-record/build.sh
@@ -0,0 +1,64 @@
#!/bin/bash

set -xe

export PYTHON_VERSION=$(${PYTHON} -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')")
export PYTHON_MAJOR_VERSION=$(echo $PYTHON_VERSION | cut -d. -f1)
export PYTHON_MINOR_VERSION=$(echo $PYTHON_VERSION | cut -d. -f2)
export BAZEL_VERSION="7.2.1"
export OUTPUT_DIR="$(pwd)"
export SOURCE_DIR="."
. "./oss/runner_common.sh"

setup_env_vars_py "$PYTHON_MAJOR_VERSION" "$PYTHON_MINOR_VERSION"

function write_to_bazelrc() {
  echo "$1" >> .bazelrc
}

write_to_bazelrc "build -c opt"
write_to_bazelrc "build --cxxopt=-std=c++17"
write_to_bazelrc "build --host_cxxopt=-std=c++17"
write_to_bazelrc "build --experimental_repo_remote_exec"
write_to_bazelrc "build --python_path=\"${PYTHON_BIN}\""
write_to_bazelrc "build --incompatible_default_to_explicit_init_py"
write_to_bazelrc "build --enable_platform_specific_config"
write_to_bazelrc "build --@rules_python//python/config_settings:python_version=${PYTHON_VERSION}"
write_to_bazelrc "test --@rules_python//python/config_settings:python_version=${PYTHON_VERSION}"
write_to_bazelrc "test --action_env PYTHON_VERSION=${PYTHON_VERSION}"
write_to_bazelrc "test --test_timeout=300"
write_to_bazelrc "test --python_path=\"${PYTHON_BIN}\""
write_to_bazelrc "common --check_direct_dependencies=error"

export USE_BAZEL_VERSION="${BAZEL_VERSION}"
bazel clean
bazel build ... --action_env PYTHON_BIN_PATH="${PYTHON_BIN}"

DEST="${OUTPUT_DIR}"'/all_dist'
mkdir -p "${DEST}/array_record"

TMPDIR="$(mktemp -d -t tmp.XXXXXXXXXX)"
cp setup.py "${TMPDIR}"
cp LICENSE "${TMPDIR}"
rsync -avm -L --exclude="bazel-*/" . "${TMPDIR}/array_record"
rsync -avm -L --include="*.so" --include="*_pb2.py" \
  --exclude="*.runfiles" --exclude="*_obj" --include="*/" --exclude="*" \
  bazel-bin/cpp "${TMPDIR}/array_record"
rsync -avm -L --include="*.so" --include="*_pb2.py" \
  --exclude="*.runfiles" --exclude="*_obj" --include="*/" --exclude="*" \
  bazel-bin/python "${TMPDIR}/array_record"
Contributor

Suggested change
- rsync -avm -L --include="*.so" --include="*_pb2.py" \
-   --exclude="*.runfiles" --exclude="*_obj" --include="*/" --exclude="*" \
-   bazel-bin/cpp "${TMPDIR}/array_record"
- rsync -avm -L --include="*.so" --include="*_pb2.py" \
-   --exclude="*.runfiles" --exclude="*_obj" --include="*/" --exclude="*" \
-   bazel-bin/python "${TMPDIR}/array_record"
+ rsync -avm -L --include="*${SHLIB_EXT}" --include="*_pb2.py" \
+   --exclude="*.runfiles" --exclude="*_obj" --include="*/" --exclude="*" \
+   bazel-bin/cpp "${TMPDIR}/array_record"
+ rsync -avm -L --include="*${SHLIB_EXT}" --include="*_pb2.py" \
+   --exclude="*.runfiles" --exclude="*_obj" --include="*/" --exclude="*" \
+   bazel-bin/python "${TMPDIR}/array_record"

Member Author

Done!
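
For context on the suggestion above: conda-build normally exports `SHLIB_EXT` during a build (`.so` on Linux, `.dylib` on macOS), so the include pattern follows the target platform instead of hard-coding `.so`. A minimal sketch of how the pattern resolves is shown below. The `shlib_ext_for` helper and the `uname`-based fallback are hypothetical and only there so the snippet runs outside a conda build; the recipe itself does not contain them.

```shell
#!/bin/bash
# Hypothetical helper (not part of the recipe): map a `uname -s` value to the
# shared-library suffix that conda-build would normally export as SHLIB_EXT.
shlib_ext_for() {
  case "$1" in
    Linux)  echo ".so" ;;
    Darwin) echo ".dylib" ;;
    *)      echo ".dll" ;;
  esac
}

# Keep conda-build's value when it is set; otherwise guess from the host OS.
SHLIB_EXT="${SHLIB_EXT:-$(shlib_ext_for "$(uname -s)")}"

# Print the include pattern the suggested rsync calls would receive.
printf -- '--include=*%s\n' "${SHLIB_EXT}"
```

One thing worth verifying before relying on this for macOS: Python extension modules commonly keep the `.so` suffix there, so `*${SHLIB_EXT}` may not match every file Bazel produces under `bazel-bin/python`.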


previous_wd="$(pwd)"
cd "${TMPDIR}"
printf '%s : === Building wheel\n' "$(date)"
$PYTHON setup.py bdist_wheel --python-tag py3"${PYTHON_MINOR_VERSION}"

cp dist/*.whl "${DEST}"

printf '%s : === Listing wheel\n' "$(date)"
ls -lrt "${DEST}"/*.whl
cd "${previous_wd}"

printf '%s : === Output wheel file is in: %s\n' "$(date)" "${DEST}"

$PYTHON -m pip install "${DEST}"/array_record*.whl
74 changes: 74 additions & 0 deletions recipes/array-record/meta.yaml
@@ -0,0 +1,74 @@
{% set name = "array-record" %}
{% set version = "0.8.0a1" %}
Contributor

Normally we don't allow alpha tags on here, but I'm somewhat inclined to grant an exception because this is a bazel project and we're lucky it even has tags at all. Any thoughts on this from the rest of @conda-forge/staged-recipes?

Member

> we can get by for now only Linux

conda-forge/tensorflow-datasets-feedstock#23
array-record is a major bottleneck for tensorflow-datasets, which depends on it on Linux, so having even just a Linux build will help a lot.

{% set sha256 = "60d73cd0f038de7a1e8c376b143a5a148b85b1fc316569f4c73a33b8be3dbf80" %}
{% set bazel_version = "7.2.1" %}

package:
  name: {{ name|lower }}
  version: {{ version }}

source:
  url: https://github.com/google/array_record/archive/v{{ version }}.tar.gz
  sha256: {{ sha256 }}

build:
  number: 0
  skip: true  # [py<310]
  skip: true  # [win]
  skip: true  # [osx]
Contributor

Suggested change
skip: true # [osx]

Let's enable macOS and see what it takes for it to build. Given that this is a bazel build, I won't ask for us to try Windows until the feedstock exists.

Member Author

I would prefer to skip osx for now:

Contributor

Okay, bazel is such a painful system to build with that we can get by with only Linux for now, but let's make sure to check into macOS arm64 afterward. The build failure you linked seems to be more of an issue with how highway-hash was built than something specific to macOS x86, but it's so difficult to debug bazel that I don't have any suggestions yet.


requirements:
  build:
    - python  # [target_platform != build_platform]
    - pip  # [target_platform != build_platform]
    - numpy  # [target_platform != build_platform]
    - setuptools  # [target_platform != build_platform]
    - wheel  # [target_platform != build_platform]
    - bazel 7.2.1
    - rsync
    - {{ compiler('c') }}
    - {{ stdlib('c') }}
    - {{ compiler('cxx') }}
  host:
    - python
    - pip
    - numpy
    - setuptools
    - wheel
  run:
    - python
    - etils
    - absl-py
    - numpy
    - more-itertools >=9.1.0
    - typing_extensions
    - importlib_resources
    - zipp
    - fsspec

test:
  imports:
    - array_record
  files:
    - test_array_record.py
  requires:
    - pip
    - python

about:
  home: https://pypi.org/project/array-record/
  summary: ArrayRecord file format.
  description: |
    ArrayRecord is a new file format derived from [Riegeli](https://github.com/google/riegeli),
    achieving a new frontier of IO efficiency. We designed ArrayRecord to support parallel
    read, write, and random access by record index. ArrayRecord builds on top of Riegeli
    and supports the same compression algorithms.
  license: Apache-2.0
  license_family: Apache
Contributor

Suggested change
license_family: Apache

Member Author

Done!

  license_file: LICENSE
  dev_url: https://github.com/google/array-record/

extra:
  recipe-maintainers:
    - iindyk
    - mtsokol
1 change: 1 addition & 0 deletions recipes/array-record/test_array_record.py
@@ -0,0 +1 @@
from array_record.python import array_record_data_source