#dailyreport #tokenizers #huggingface #rust #gentoo
#ebuild #secops #cargo
I compiled the HF tokenizers library from sources and
enhanced Gentoo ebuild file to allow reproducible
installation from sources.
I removed optional dependencies and disabled HTTP
requirements to enhance security.
I wrote very simple tests for tokenizers, safetensors,
transformers and integration test for them, because
tokenizers require HF hub for testing, that I disabled.
It was a hard but good experience with Cargo package
manager of Rust. The main problems was due to strange
cfg flags that Gentoo should have set automaticly, for
ex. target_os=linux was not set. "cfg" is an
abomination that you can't add change this safely.
I didn't find a working solution to manage "cfg" and, so
I just patched the Cargo.toml files of dependencies by
commenting out lines.
(∠・ω )⌒