Installation
Install yoink from a local clone or directly from GitHub, including optional extras for Parquet, S3, and JavaScript rendering.
yoink supports Python 3.11+ and runs on Linux, macOS, and Windows.
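For setup scripts or CI jobs, the version requirement above can be checked before installing. This is a minimal sketch and not part of yoink; `MIN_PYTHON` and `python_is_supported` are illustrative names.

```python
import sys

# Minimum interpreter version stated above (Python 3.11+).
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    if python_is_supported():
        print(f"Python {current} is supported")
    else:
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {current}")
```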
Standard install
yoink is currently distributed from source on GitHub:
git clone https://github.com/ErikkJs/yoink
cd yoink
pip install -e .
If you use Poetry, the project ships a pyproject.toml:
git clone https://github.com/ErikkJs/yoink
cd yoink
poetry install
You can also install directly from the GitHub URL without cloning:
pip install "git+https://github.com/ErikkJs/yoink.git"
Optional extras
yoink keeps heavy dependencies behind extras so the core stays lean.
| Extra | Adds | When you need it |
|---|---|---|
| parquet | pyarrow | Writing crawl output as columnar Parquet files |
| s3 | aioboto3 | Checkpointing to AWS S3 (Lambda, EC2, ECS workloads) |
| browser | playwright | Rendering JavaScript-heavy sites and SPAs |
| all | All of the above | When you don't want to think about it |
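Each extra in the table maps to one importable package, so a short check can show which extras are already present in the current environment. The sketch below uses only the standard library; `EXTRA_PACKAGES` and `installed_extras` are illustrative names, not part of yoink.

```python
from importlib.util import find_spec

# Extra name -> top-level import name of the package it adds (from the table above).
EXTRA_PACKAGES = {
    "parquet": "pyarrow",
    "s3": "aioboto3",
    "browser": "playwright",
}


def installed_extras(packages=EXTRA_PACKAGES):
    """Map each extra to True if its package is importable in this environment."""
    return {extra: find_spec(module) is not None for extra, module in packages.items()}


if __name__ == "__main__":
    for extra, present in installed_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```

This only confirms that the Python packages can be found. For the browser extra, it does not check whether the Playwright browser binaries have been downloaded.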
# Install with one extra (from a local clone)
pip install -e ".[parquet]"
# Multiple extras
pip install -e ".[s3,parquet]"
# Everything
pip install -e ".[all]"
# Or from GitHub
pip install "yoink[all] @ git+https://github.com/ErikkJs/yoink.git"
Playwright browsers
The browser extra installs the Playwright Python package, but you also need to download the actual browser binaries (Chromium / Firefox / WebKit):
pip install -e ".[browser]"
playwright install chromium
S3 credentials
The s3 extra brings in aioboto3, but the SDK still needs credentials. Any of these work:
# 1. AWS CLI profile (recommended for local development)
aws configure
# 2. Environment variables
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_DEFAULT_REGION=us-east-1
# 3. IAM role (automatic on EC2 / ECS / Lambda)
# No configuration needed
Minimum IAM permissions for the bucket you're checkpointing to:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}
Verify the install
yoink version
# yoink version 0.1.0
# The public data crawler.
You're ready. Head to the quickstart.
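In automated environments such as CI jobs or container builds, the `yoink version` check can be scripted. The sketch below assumes only that the `yoink` executable is on PATH after installation; `cli_output` is an illustrative helper, not part of yoink.

```python
import subprocess


def cli_output(argv):
    """Run a command and return its stripped stdout, or None if it is missing or fails."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing executable; CalledProcessError a non-zero exit code.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    output = cli_output(["yoink", "version"])
    if output is None:
        print("yoink was not found on PATH, or `yoink version` failed")
    else:
        print(output)
```

Returning `None` instead of raising keeps the helper easy to use in a setup script that should report a missing install rather than crash.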