[infra] Upgrade Python to 3.10.14 in base-builder & base-runner Images (#12027)

> [!NOTE]  
> I was looking for somewhere to get feedback from maintainers about
this approach to the Python 3.10 upgrade before attempting it, but the
discussion surrounding a Python upgrade has been rather fragmented
across many issues, PRs, and comment chains.
>
> For that reason, I felt it would be easier to propose with a working
example and dedicated PR.


#### Fixes:
- #11419
- #9638

#### Supersedes:
- #9532
- #11420


## Changes

The changes introduced here upgrade Python from 3.8 to 3.10.14 inside
the base-builder and base-runner images.

### Base Image Changes

| Image | Before Changes | After Changes |
|-------|----------------|---------------|
| **base-builder** | Compiled Python 3.8 from source using official release servers at https://www.python.org/ftp/python/. | Compiles Python 3.10.14 (the latest 3.10 release) from source using official release servers at https://www.python.org/ftp/python/. |
| **base-runner** | Installed Python 3.8 from the default apt repository provided by the Ubuntu 20.04 image. | Uses a multi-stage build to copy the Python 3.10.14 interpreter compiled by the base-builder image, ensuring version sync and saving build time by re-using a pre-built version. |


## Known Impact on Projects

### 3.9 Workarounds That Can Be Removed

| Project | Fix Link |
|---------|----------|
| dask | DaveLak@417bbf5 |
| docutils | DaveLak@e4c21ff |
| dovecot | DaveLak@7ab3ab6 |
| nbclassic | DaveLak@5509b4e |
| pandas | DaveLak@0642a7a |
| pybind11 | DaveLak@a5bbdb3 |
| pyodbc | DaveLak@afa2b5e |
| qpid-proton | DaveLak@f5bf756 |

### Anticipated Build Failures

#### Preexisting Failures 

##### Fix is Prepared

| Project | Fix Link |
|---------|----------|
| airflow | DaveLak@60a0368 |
| ipython | DaveLak@21ac68e |
| networkx | DaveLak@fc2f8c5 |
| numpy | DaveLak@9383c87 |
| tensorflow-addons | DaveLak@eed2bea |
| django (coverage build) | DaveLak@c724d61 |
| proto-plus-python | DaveLak@37d973e |
| dnspython | The upgraded pip version in the base-builder fixes the currently failing build. |

##### Fix Requires Upstream Changes

| Project | Issue |
|---------|-------|
| pyvex | Currently failing on Python 3.9 because its `archinfo` dependency requires Python >=3.10. Still fails after the 3.10 upgrade because [the upstream build script needs `python3.9` replaced with `python3`](https://github.com/angr/pyvex/blob/f94c95636a3800c5bbd781ecf1e3fb0c0d9feec4/fuzzing/build.sh#L19-L23). |

##### Requires More Investigation

| Project | Issue |
|---------|-------|
| matplotlib | Upgrading Python & Pyinstaller resolves the existing build issues, but exposes an error in the fuzz harness that must be resolved for check_build to pass. The exception: `TypeError: Parser.non_math() takes 2 positional arguments but 4 were given` in `File "fuzz_plt.py", line 43, in TestOneInput`. |
| scipy | Upgrading Python & Pyinstaller resolves the existing build issues, but a separate error in the build step still causes the build to fail. The error seems related to linking: "/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25". When `export LDFLAGS="-fuse-ld=lld"` is set, the error becomes: "`ld.lld: error: undefined symbol: __asan_report_store4`". |
| pandas (Introspector only) | [This workaround in `build.sh` is the issue](https://github.com/google/oss-fuzz/blob/1515519a665756d8a50a6c46abac8b431e5462ef/projects/pandas/build.sh#L22-L32). |
| pycrypto | Failing with error: "`SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats`". Seems like the issue described [here](https://stackoverflow.com/a/71019907). Pycrypto is deprecated and this is unlikely to be fixed upstream. |


## Possible Future Improvements

Using the base-builder image in a multi-stage build to copy the
pre-compiled Python into base-runner is effective, but feels like a
workaround that may be introducing tech debt. A cleaner approach would
be to extract the Python compilation into a discrete base image similar
to how `base-clang` works, and use that as the multi-stage builder in
images that need it.

### Fuzz Introspector Caveat

Fuzz Introspector currently uses Python 3.9. While an upgrade to 3.10 is
not expected to introduce any new issues, it was not tested on these
changes and may require additional work.

---

## Motivation

- Python [3.8 is reaching end of life in October
2024](https://devguide.python.org/versions/).
- The [Scientific Python Community already encourages dropping 3.8
support](https://scientific-python.org/specs/spec-0000/).
- This is evident when looking at which projects have resorted to
upgrading to newer Pythons using ad-hoc workarounds (see `numpy`,
`scipy`, `pandas`, etc.)
- It is likely that more Python projects will begin dropping support for
3.8, further increasing the number of broken builds and ad-hoc
workarounds.
- Code coverage does not work on Python projects that use Python 3.10+
syntax.
- Previous attempts at upgrading Python have stalled (see
google/clusterfuzz#3290 (comment)
& the issues linked under "Supersedes" above.)
- In recognition of the fact that OSS-Fuzz maintainers are stretched
thin, I thought I'd give it a shot.
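
The code coverage bullet above comes down to parsing: tooling that runs on an older interpreter cannot compile source files that use 3.10-only syntax. A minimal sketch, assuming a `match` statement as the representative 3.10+ construct (the helper name `parses_on_this_interpreter` is hypothetical and not part of OSS-Fuzz):

```python
import sys

# Source that uses structural pattern matching, which needs Python 3.10+.
MATCH_SOURCE = '''
def describe(value):
    match value:
        case 0:
            return "zero"
        case _:
            return "other"
'''


def parses_on_this_interpreter(source: str) -> bool:
    """Return True if the running interpreter can compile `source`."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(version, parses_on_this_interpreter(MATCH_SOURCE))
```

On an interpreter older than 3.10 the helper returns `False` for `MATCH_SOURCE`, which is the situation any coverage tooling running on that interpreter would be in.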

---------

Co-authored-by: Oliver Chang <[email protected]>
Co-authored-by: Andrew Murray <[email protected]>
3 people authored Nov 25, 2024
1 parent 9dad89d commit 93b417e
Showing 20 changed files with 68 additions and 59 deletions.
14 changes: 8 additions & 6 deletions infra/base-images/base-builder/Dockerfile
@@ -19,9 +19,9 @@ FROM gcr.io/oss-fuzz-base/base-clang
COPY install_deps.sh /
RUN /install_deps.sh && rm /install_deps.sh

-# Build and install latest Python 3 (3.8.3).
-ENV PYTHON_VERSION 3.8.3
-RUN export PYTHON_DEPS="\
+# Build and install latest Python 3.10.
+ENV PYTHON_VERSION 3.10.14
+RUN PYTHON_DEPS="\
zlib1g-dev \
libncurses5-dev \
libgdbm-dev \
@@ -39,12 +39,14 @@ RUN export PYTHON_DEPS="\
tar -xvf Python-$PYTHON_VERSION.tar.xz && \
cd Python-$PYTHON_VERSION && \
./configure --enable-optimizations --enable-shared && \
-make -j install && \
+make -j$(nproc) install && \
ldconfig && \
-ln -s /usr/bin/python3 /usr/bin/python && \
+ln -s /usr/local/bin/python3 /usr/local/bin/python && \
cd .. && \
rm -r /tmp/Python-$PYTHON_VERSION.tar.xz /tmp/Python-$PYTHON_VERSION && \
-rm -rf /usr/local/lib/python3.8/test && \
+rm -rf /usr/local/lib/python${PYTHON_VERSION%.*}/test && \
+python3 -m ensurepip && \
+python3 -m pip install --upgrade pip && \
apt-get remove -y $PYTHON_DEPS # https://github.com/google/oss-fuzz/issues/3888
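
The `${PYTHON_VERSION%.*}` expansion in the hunk above strips the shortest trailing `.something` from the version, so `3.10.14` yields `3.10` and the test directory path no longer needs to be hardcoded. A minimal Python sketch of the same string operation (the helper names `minor_series` and `stdlib_test_dir` are hypothetical, for illustration only):

```python
def minor_series(version: str) -> str:
    """Drop the last dot-separated component, like the shell's ${VAR%.*}."""
    head, sep, _tail = version.rpartition(".")
    # With no dot present the shell expansion leaves the value unchanged.
    return head if sep else version


def stdlib_test_dir(version: str) -> str:
    """Build the stdlib test directory path that the Dockerfile removes."""
    return f"/usr/local/lib/python{minor_series(version)}/test"


if __name__ == "__main__":
    print(stdlib_test_dir("3.10.14"))
```

Deriving the directory from the full version string means a future bump of `PYTHON_VERSION` only has to touch one line.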


4 changes: 2 additions & 2 deletions infra/base-images/base-builder/compile_python_fuzzer
@@ -32,7 +32,7 @@ if [[ $SANITIZER = *introspector* ]]; then
# we enter the virtual environment in the following lines because we need
# to use the same python environment that installed the fuzzer dependencies.
python3 /fuzz-introspector/frontends/python/prepare_fuzz_imports.py $fuzzer_path isossfuzz

# We must ensure python3.9, this is because we use certain
# AST logic from there.
# The below should probably be refined
@@ -84,7 +84,7 @@ then
if [[ ! -d "/pysecsan" ]];
then
pushd /usr/local/lib/sanitizers/pysecsan
-python3 setup.py install
+python3 -m pip install .
popd
fi

2 changes: 1 addition & 1 deletion infra/base-images/base-builder/install_python.sh
@@ -19,5 +19,5 @@ echo "ATHERIS INSTALL"
unset CFLAGS CXXFLAGS
# PYI_STATIC_ZLIB=1 is needed for installing pyinstaller 5.0
export PYI_STATIC_ZLIB=1
-LIBFUZZER_LIB=$( echo /usr/local/lib/clang/*/lib/x86_64-unknown-linux-gnu/libclang_rt.fuzzer_no_main.a ) pip3 install -v --no-cache-dir "atheris>=2.1.1" "pyinstaller==5.0.1" "setuptools==42.0.2" "coverage==6.3.2"
+LIBFUZZER_LIB=$( echo /usr/local/lib/clang/*/lib/x86_64-unknown-linux-gnu/libclang_rt.fuzzer_no_main.a ) pip3 install -v --no-cache-dir "atheris>=2.3.0" "pyinstaller==6.10.0" "setuptools==72.1.0" "coverage==6.3.2"
rm -rf /tmp/*
@@ -98,7 +98,7 @@ def hook_pre_exec_os_system(cmd):
'Command injection')


-def hook_pre_exec_eval(cmd):
+def hook_pre_exec_eval(cmd, *args, **kwargs):
"""Hook for eval. Experimental atm."""
res = check_code_injection_match(cmd, check_unquoted=True)
if res is not None:
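
The signature change above matters because `eval` can be called with extra `globals` and `locals` arguments. A minimal sketch, assuming the pre-exec hook receives exactly the arguments of the hooked call (the `add_pre_hook` wrapper and both hook functions are hypothetical stand-ins, not pysecsan's actual hooking code):

```python
def add_pre_hook(func, pre_hook):
    """Return a wrapper that runs pre_hook with the caller's exact arguments."""
    def wrapper(*args, **kwargs):
        pre_hook(*args, **kwargs)
        return func(*args, **kwargs)
    return wrapper


def narrow_hook(cmd):
    """Old-style signature: accepts only the expression."""


def tolerant_hook(cmd, *args, **kwargs):
    """New-style signature: tolerates globals/locals passed to eval()."""


def call_with_globals(hook):
    """Call a hooked eval() the way real code may: with a globals dict."""
    hooked_eval = add_pre_hook(eval, hook)
    return hooked_eval("x + 1", {"x": 41})


if __name__ == "__main__":
    print(call_with_globals(tolerant_hook))
    try:
        call_with_globals(narrow_hook)
    except TypeError as err:
        print("narrow hook failed:", err)
```

Under that assumption, the narrow signature raises `TypeError` as soon as the hooked function is called with more than one argument, while the tolerant one passes the call through.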
@@ -22,7 +22,7 @@
import functools
import subprocess
import traceback
-import importlib
+import importlib.util

from typing import Any, Callable, Optional
from pysecsan import command_injection, redos, yaml_deserialization
@@ -54,7 +54,7 @@ def sanitizer_log_always(msg, log_prefix=True):
def is_module_present(mod_name):
"""Identify if module is importable."""
# pylint: disable=deprecated-method
-return importlib.find_loader(mod_name) is not None
+return importlib.util.find_spec(mod_name) is not None


def _log_bug(bug_title):
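
The hunk above swaps the long-deprecated `importlib.find_loader` for `importlib.util.find_spec`. A standalone sketch of the same check follows; the extra exception handling is an addition for illustration and is not in the commit. It covers the case where `find_spec` raises instead of returning `None`, for example when the parent package of a dotted name is missing:

```python
import importlib.util


def is_module_present(mod_name: str) -> bool:
    """Return True if `mod_name` can be located without importing it."""
    try:
        return importlib.util.find_spec(mod_name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages for dotted names and can raise
        # if a parent is missing or a loaded module has no usable __spec__.
        return False


if __name__ == "__main__":
    print(is_module_present("json"))
    print(is_module_present("surely_not_a_real_module_xyz"))
```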
22 changes: 21 additions & 1 deletion infra/base-images/base-runner/Dockerfile
@@ -25,6 +25,11 @@ RUN cargo install rustfilt
FROM gcr.io/oss-fuzz-base/base-clang AS base-clang
FROM gcr.io/oss-fuzz-base/base-builder-ruby AS base-ruby

+# The base builder image compiles a specific Python version. Using a multi-stage build
+# to copy that same Python interpreter into the runner image saves build time and keeps
+# the Python versions in sync.
+FROM gcr.io/oss-fuzz-base/base-builder AS base-builder
+
# Real image that will be used later.
FROM gcr.io/oss-fuzz-base/base-image

@@ -36,6 +41,18 @@ COPY --from=base-clang /usr/local/bin/llvm-cov \
/usr/local/bin/llvm-symbolizer \
/usr/local/bin/

+# Copy the pre-compiled Python binaries and libraries
+COPY --from=base-builder /usr/local/bin/python3.10 /usr/local/bin/python3.10
+COPY --from=base-builder /usr/local/lib/libpython3.10.so.1.0 /usr/local/lib/libpython3.10.so.1.0
+COPY --from=base-builder /usr/local/include/python3.10 /usr/local/include/python3.10
+COPY --from=base-builder /usr/local/lib/python3.10 /usr/local/lib/python3.10
+COPY --from=base-builder /usr/local/bin/pip3 /usr/local/bin/pip3
+
+# Create symbolic links to ensure compatibility
+RUN ldconfig && \
+ln -s /usr/local/bin/python3.10 /usr/local/bin/python3 && \
+ln -s /usr/local/bin/python3.10 /usr/local/bin/python
+
COPY install_deps.sh /
RUN /install_deps.sh && rm /install_deps.sh

@@ -46,8 +63,11 @@ RUN git clone https://chromium.googlesource.com/chromium/src/tools/code_coverage
cd /opt/code_coverage && \
git checkout edba4873b5e8a390e977a64c522db2df18a8b27d && \
pip3 install wheel && \
+# If version "Jinja2==2.10" is in requirements.txt, bump it to a patch version that
+# supports upgrading its MarkupSafe dependency to a Python 3.10 compatible release:
+sed -i 's/Jinja2==2.10/Jinja2==2.10.3/' requirements.txt && \
pip3 install -r requirements.txt && \
-pip3 install MarkupSafe==0.23 && \
+pip3 install MarkupSafe==2.0.1 && \
pip3 install coverage==6.3.2

# Default environment options for various sanitizers.
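
The `sed` substitution in the hunk above is unanchored, so it would also rewrite a line such as `Jinja2==2.10.1` into `Jinja2==2.10.3.1`. That is harmless while the pinned checkout lists exactly `Jinja2==2.10`. As a sketch of a stricter variant, assuming one requirement per line (the helper name `bump_jinja2_pin` is hypothetical and not part of the commit):

```python
import re


def bump_jinja2_pin(requirements_text: str) -> str:
    """Rewrite only an exact `Jinja2==2.10` line to `Jinja2==2.10.3`."""
    return re.sub(
        r"^Jinja2==2\.10$",  # anchored so longer versions stay untouched
        "Jinja2==2.10.3",
        requirements_text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    print(bump_jinja2_pin("Jinja2==2.10\nMarkupSafe==0.23\n"))
```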
4 changes: 1 addition & 3 deletions infra/base-images/base-runner/install_deps.sh
@@ -20,12 +20,10 @@
apt-get update && apt-get install -y \
binutils \
file \
+ca-certificates \
fonts-dejavu \
git \
libcap2 \
-python3 \
-python3-pip \
-python3-setuptools \
rsync \
unzip \
wget \
2 changes: 1 addition & 1 deletion projects/configparser/Dockerfile
@@ -13,7 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
FROM gcr.io/oss-fuzz-base/base-builder-python
-RUN pip3 install --upgrade pip
+RUN python -m pip install 'setuptools~=69.0'
RUN git clone https://github.com/jaraco/configparser/ configparser
COPY *.sh *py $SRC/
WORKDIR $SRC/configparser
2 changes: 1 addition & 1 deletion projects/configparser/build.sh
@@ -14,7 +14,7 @@
# limitations under the License.
#
################################################################################
-pip3 install .
+python -m pip install .
# Build fuzzers in $OUT.
for fuzzer in $(find $SRC -name 'fuzz_*.py'); do
compile_python_fuzzer $fuzzer
10 changes: 1 addition & 9 deletions projects/django/Dockerfile
@@ -16,15 +16,7 @@

FROM gcr.io/oss-fuzz-base/base-builder-python

-RUN apt-get update -y \
-&& apt-get install -y libgdal26 software-properties-common \
-&& add-apt-repository -y ppa:deadsnakes/ppa \
-&& apt-get install -y python3.10 python3.10-dev \
-&& ln --force -s /usr/bin/python3.10 /usr/local/bin/python3 \
-&& curl -sS https://bootstrap.pypa.io/get-pip.py | python3 \
-&& python3 -m pip install -v --no-cache-dir "atheris>=2.1.1" "pyinstaller==5.0.1" "coverage==6.3.2" \
-&& rm -rf /var/lib/apt/lists/*
-
+RUN python3 -m pip install cython
RUN git clone --depth 1 https://github.com/django/django.git
RUN git clone --depth 1 https://github.com/django/django-fuzzers.git

5 changes: 2 additions & 3 deletions projects/fwupd/Dockerfile
@@ -15,9 +15,8 @@
################################################################################

FROM gcr.io/oss-fuzz-base/base-builder
-RUN apt-get update
-RUN apt-get install -y pkg-config zlib1g-dev libffi-dev liblzma-dev libcbor-dev
-RUN pip3 install -U meson ninja
+RUN apt-get update && apt-get install -y pkg-config zlib1g-dev libffi-dev liblzma-dev libcbor-dev
+RUN python3 -m pip install -U jinja2 packaging meson ninja
RUN git clone --depth 1 https://github.com/fwupd/fwupd.git fwupd
WORKDIR .
COPY build.sh $SRC/
2 changes: 1 addition & 1 deletion projects/nbclassic/build.sh
@@ -16,5 +16,5 @@
################################################################################
pip3 install .
for fuzzer in $(find $SRC -name 'fuzz_*.py'); do
-compile_python_fuzzer $fuzzer --add-data $SRC/jsonschema_specifications/jsonschema_specifications/schemas:jsonschema_specifications/schemas --add-data /usr/local/lib/python3.8/site-packages/jupyter_events/schemas:jupyter_events/schemas
+compile_python_fuzzer $fuzzer --add-data $SRC/jsonschema_specifications/jsonschema_specifications/schemas:jsonschema_specifications/schemas --add-data /usr/local/lib/python3.10/site-packages/jupyter_events/schemas:jupyter_events/schemas
done
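
The `--add-data` argument above still hardcodes the interpreter's minor version in the `site-packages` path, so it has to be edited on every Python upgrade. One possible alternative, sketched here as an assumption rather than something the commit does, is to ask the interpreter where the package lives (the helpers `package_data_dir` and `add_data_arg` are hypothetical):

```python
import importlib.util
import os


def package_data_dir(package: str, subdir: str) -> str:
    """Return `<installed package directory>/<subdir>` for an importable package."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package!r} is not an importable package")
    return os.path.join(list(spec.submodule_search_locations)[0], subdir)


def add_data_arg(package: str, subdir: str) -> str:
    """Format a `source:destination` value in the shape --add-data expects."""
    return f"{package_data_dir(package, subdir)}:{package}/{subdir}"


if __name__ == "__main__":
    # `json` is a stdlib stand-in; a build script would pass `jupyter_events`.
    print(add_data_arg("json", "schemas"))
```

The helper does not check that the subdirectory exists; it only resolves where the package is installed for the running interpreter.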
3 changes: 2 additions & 1 deletion projects/pffft/Dockerfile
@@ -15,7 +15,8 @@
################################################################################

FROM gcr.io/oss-fuzz-base/base-builder
-RUN apt-get update && apt-get install -y mercurial python-numpy python
+RUN apt-get update && apt-get install -y mercurial
+RUN python -m pip install numpy
RUN git clone https://bitbucket.org/jpommier/pffft $SRC/pffft
WORKDIR pffft
COPY build.sh $SRC
25 changes: 11 additions & 14 deletions projects/proto-plus-python/fuzz_json_serialization.py
@@ -18,32 +18,29 @@
import proto
from google.protobuf.json_format import ParseError

+
+class FuzzMsg(proto.Message):
+val1 = proto.Field(proto.FLOAT, number=1)
+val2 = proto.Field(proto.INT32, number=2)
+val3 = proto.Field(proto.BOOL, number=3)
+val4 = proto.Field(proto.STRING, number=4)
+
+
def TestOneInput(data):
fdp = atheris.FuzzedDataProvider(data)

-class FuzzMsg(proto.Message):
-val1 = proto.Field(proto.FLOAT, number=1)
-val2 = proto.Field(proto.INT32, number=2)
-val3 = proto.Field(proto.BOOL, number=3)
-val4 = proto.Field(proto.STRING, number=4)
-
try:
s = FuzzMsg.from_json(fdp.ConsumeUnicodeNoSurrogates(sys.maxsize))
FuzzMsg.to_json(s)
-except ParseError:
-pass
-except TypeError:
-pass
-except RecursionError:
-pass
+except (ParseError, TypeError, RecursionError):
+return


def main():
atheris.instrument_all()
-atheris.Setup(sys.argv, TestOneInput, enable_python_coverage=True)
+atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()


if __name__ == "__main__":
main()
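
The harness change above follows a common shape: build expensive definitions once at module level, and treat the expected parse failures as a single tuple of exception types. A dependency-free sketch of that shape, using the standard library `json` module as a stand-in for `proto` and a plain function instead of an Atheris entry point (all names here are hypothetical):

```python
import json

# Exceptions that are expected for malformed input and are not bugs.
HANDLED_ERRORS = (json.JSONDecodeError, TypeError, RecursionError)


def run_one_input(data: bytes) -> None:
    """Round-trip one fuzz input; swallow only the expected failure modes."""
    text = data.decode("utf-8", errors="ignore")
    try:
        json.dumps(json.loads(text))
    except HANDLED_ERRORS:
        return


if __name__ == "__main__":
    for sample in (b'{"val1": 1.5}', b"not json", b"\xff\xfe"):
        run_one_input(sample)
    print("all samples handled")
```

Any exception outside `HANDLED_ERRORS` still propagates, which is what lets a fuzzer report it as a finding.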

3 changes: 1 addition & 2 deletions projects/pybind11/Dockerfile
@@ -16,8 +16,7 @@
FROM gcr.io/oss-fuzz-base/base-builder

RUN apt-get update && \
-apt-get install -y python-is-python3 build-essential pip python3-dev
-RUN python3 -m pip install --upgrade pip
+apt-get install -y build-essential

RUN git clone https://github.com/pybind/pybind11
COPY build.sh *_fuzzer.cc $SRC/
6 changes: 3 additions & 3 deletions projects/pybind11/build.sh
@@ -20,13 +20,13 @@ cmake -S . -B build -DDOWNLOAD_CATCH=ON -DDOWNLOAD_EIGEN=ON
cmake --build build -j4
python3 -m pip install .

-cp /usr/local/lib/libpython3.8.so.1.0 $OUT/
+cp /usr/local/lib/libpython3.10.so.1.0 $OUT/
for f in $SRC/*_fuzzer.cc; do
fuzzer=$(basename "$f" _fuzzer.cc)
$CXX $CXXFLAGS \
--I$SRC/pybind11/include -isystem /usr/local/include/python3.8 \
+-I$SRC/pybind11/include -isystem /usr/local/include/python3.10 \
$SRC/${fuzzer}_fuzzer.cc -o $OUT/${fuzzer}_fuzzer \
-/usr/local/lib/libpython3.8.so.1.0 \
+/usr/local/lib/libpython3.10.so.1.0 \
$LIB_FUZZING_ENGINE -lpthread
patchelf --set-rpath '$ORIGIN/' $OUT/${fuzzer}_fuzzer
done
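
The build script above hardcodes the include directory and `libpython` path for a specific minor version. The interpreter can report these locations itself through the standard `sysconfig` module. The sketch below is an illustration of that lookup, not what the commit does; the helper name `python_build_paths` is hypothetical, and the library-related values depend on how the interpreter was built:

```python
import sys
import sysconfig


def python_build_paths() -> dict:
    """Collect the paths a C/C++ build embedding this interpreter would need."""
    return {
        "version": sysconfig.get_python_version(),           # e.g. "3.10"
        "include": sysconfig.get_paths()["include"],         # header directory
        "libdir": sysconfig.get_config_var("LIBDIR"),        # may be None
        "ldlibrary": sysconfig.get_config_var("LDLIBRARY"),  # may be None
    }


if __name__ == "__main__":
    for key, value in python_build_paths().items():
        print(f"{key}: {value}")
```

A build script could read these values once and pass them to the compiler, which would avoid editing version numbers at each upgrade.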
4 changes: 3 additions & 1 deletion projects/pycrypto/Dockerfile
@@ -14,7 +14,9 @@
#
################################################################################

-FROM gcr.io/oss-fuzz-base/base-builder-python
+# Held back because of github.com/google/oss-fuzz/pull/12027
+# Please fix failure and upgrade if possible.
+FROM gcr.io/oss-fuzz-base/base-builder-python@sha256:d8fe5e2a6a96723f393de413c48d9455a5124995b2349a2e4d6b9abecf99d6d5
RUN git clone https://github.com/pycrypto/pycrypto
COPY build.sh *.py $SRC/
WORKDIR pycrypto
5 changes: 3 additions & 2 deletions projects/pyzmq/Dockerfile
@@ -13,7 +13,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
FROM gcr.io/oss-fuzz-base/base-builder-python
-RUN pip3 install --upgrade pip cython
-RUN git clone https://github.com/zeromq/pyzmq pyzmq
+RUN apt-get update && apt-get install -y libzmq3-dev
+RUN python -m pip install cython
+RUN git clone --depth 1 --branch main https://github.com/zeromq/pyzmq pyzmq
COPY *.sh *py $SRC/
WORKDIR $SRC/pyzmq
5 changes: 3 additions & 2 deletions projects/pyzmq/build.sh
@@ -14,7 +14,8 @@
# limitations under the License.
#
################################################################################
-pip3 install .
+python -m pip install .
+
for fuzzer in $(find $SRC -name 'fuzz_*.py'); do
-compile_python_fuzzer $fuzzer
+compile_python_fuzzer $fuzzer --collect-all="pyzmq"
done
3 changes: 0 additions & 3 deletions projects/six/fuzz_six.py
@@ -55,9 +55,6 @@ def TestOneInput(data):
except (TypeError, UnicodeDecodeError):
pass

-six.moves.html_parser.HTMLParser().unescape(
-fdp.ConsumeUnicodeNoSurrogates(fdp.ConsumeIntInRange(1, 1024)))
-

def main():
atheris.instrument_all()
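
The call removed above relied on `HTMLParser.unescape`, a method that was deprecated for a long time and removed in Python 3.9. The commit does not state its reason, but that removal is presumably why the harness would fail after the upgrade. A small sketch of the supported replacement, `html.unescape`, with a hypothetical helper name:

```python
import html
import sys
from html.parser import HTMLParser

# True only on interpreters that still ship the legacy method (before 3.9).
HAS_LEGACY_UNESCAPE = hasattr(HTMLParser, "unescape")


def unescape_entities(text: str) -> str:
    """Convert HTML character references to text with the supported API."""
    return html.unescape(text)


if __name__ == "__main__":
    print(sys.version_info[:2], HAS_LEGACY_UNESCAPE)
    print(unescape_entities("&lt;b&gt; &amp; &quot;x&quot;"))
```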