NFT Metadata Scraper

This application scrapes NFT metadata from IPFS using a CSV list of IPFS CIDs and stores the results in a PostgreSQL database. It also hosts an API that allows users to retrieve all stored metadata or a specific row based on a CID.

Features

  • Read a list of IPFS CIDs from a CSV file
  • Fetch metadata for each CID from IPFS
  • Store the name and image fields in a PostgreSQL database
  • Provide an API to retrieve all data or a specific row based on a CID

Prerequisites

  • Go 1.19 or higher
  • PostgreSQL
  • Docker (optional), for running the app and PostgreSQL in containers
  • jq for handling some JSON configuration (optional, used for GitHub Actions)

🚀 New Features

  • 🐳 Containerization for Local Development: Utilizes Docker Compose for easy setup and teardown of the development environment. See ADR 1.

  • 🚑 Health Check Endpoint: A new /healthcheck endpoint checks database connectivity and returns the application's version, improving load balancer integration. See ADR 10.

  • 🏗️ Terraform Infrastructure: Infrastructure as Code using Terraform for reproducible and scalable cloud environments. See ADR 2 for GitOps and ADR 3 for ECS specifics.

  • 🔐 Secure Database Password Handling: Securely manages database passwords, avoiding plain text exposure. See ADR 9.

  • 🛡️ Distroless Containers: For production, uses distroless containers to minimize attack surfaces. See ADR 5. Not to mention, the final image is super-lightweight at only 7.29MB💨.

  • 🔄 GitOps Workflow: Implements a GitOps workflow for secure and automated infrastructure deployment. See ADR 2.

  • 🔑 Least Privilege Pipeline: Ensures the CI/CD pipeline operates with the least privilege necessary, enhancing security. See ADR 7.

  • 🤖 Machine-Generated Config Files: Simplifies setup and ensures consistency with machine-generated HCL and JSON configuration files managed by Terraform. See ADR 8.

Running with Docker Compose

Step 1: Install Docker Compose

Follow the instructions here to install Docker Compose.

Clone this repository:

git clone https://github.com/dukeofgaming/ipfs-metadata.git

Step 2: Start the Application

Run docker-compose up --build to start the application. This will:

  • Build the application container.
  • Start a PostgreSQL container.
  • Start the application container.

Step 3: Shut down the application

Run docker-compose down --volumes to shut down the application and remove its volumes (both the named volumes declared in the Compose file and any anonymous volumes attached to the containers).

AWS Infrastructure Setup

You will need Terraform for this, which can be installed by following the instructions here.

Setup / Initialize the core state

Quickstart

For full instructions including how to migrate to an S3 backend from an initial run (highly recommended), see the README in the iac/terraform/core directory.

  1. Copy the .env.sh.dist file to .env.sh and fill in the required values, then run:

    source .env.sh
  2. Run Terraform:

    terraform init
    terraform apply
  3. After your first apply, a backend.hcl file should have been generated for you. To enable the S3 backend, copy backend.tf.dist to backend.tf and re-initialize:

    cp backend.tf.dist backend.tf
    terraform init -backend-config=backend.hcl

If this is not your first run and you are migrating to an S3 backend or using an existing one, run terraform init -backend-config=backend.hcl.

Deploy the app with ECS & RDS

Quickstart

For full instructions including how to migrate to an S3 backend from an initial run (highly recommended), see the README in the iac/terraform/app directory.

  1. Copy the .env.sh.dist file to .env.sh and fill in the required values, then run:

    source .env.sh
  2. After running the core setup, you should see 3 HCL files with backend configuration ready to go. Navigate to the iac/terraform/app directory and run:

    chmod +x ./switch-backend.sh
    cp terraform.tfvars.json.dist terraform.tfvars.json
    ./switch-backend.sh dev
  3. Run Terraform:

    terraform init
    terraform apply

Changing the RDS password

Once you have deployed both projects and have at least one successful build in GitHub Actions, you can change the RDS password by following these steps:

  1. Go to your GitHub repository Settings
  2. Navigate to Environments
  3. Click on the environment you want to change the password for.
  4. Look for the RDS_MASTER_PASSWORD secret and click on the "Update" button, enter a new password, then save.
  5. Go to your latest successful build for that environment and click on the "Re-run jobs" button.
  6. Make sure the ECS task has restarted and you have a new container running.

Running without a container

Step 1: Clone the Repository

git clone https://github.com/shawnwollenberg/ipfs-metadata.git
cd ipfs-metadata

Step 2: Set Up PostgreSQL

You can either set up PostgreSQL locally or use Docker to run it in a container.

Using Docker:

docker run --name postgres -e POSTGRES_USER=youruser -e POSTGRES_PASSWORD=yourpassword -e POSTGRES_DB=yourdb -p 5432:5432 -d postgres

Step 3: Configure Environment Variables

Create a .env file (or copy the .env.dist file) in the root directory of the project with the following content:

POSTGRES_USER=youruser
POSTGRES_PASSWORD=yourpassword
POSTGRES_DB=yourdb
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
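As a rough illustration of how these variables can be turned into a connection string, the sketch below assembles a PostgreSQL URL from them. The function name postgresURL is hypothetical, and sslmode=disable is an assumption that only suits a local, unencrypted database; the application itself may build its connection differently.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"os"
)

// postgresURL builds a connection URL from the POSTGRES_* variables.
// getenv is injected so the function can be exercised without touching
// the real environment; pass os.Getenv in normal use.
func postgresURL(getenv func(string) string) string {
	u := url.URL{
		Scheme: "postgres",
		// UserPassword percent-encodes characters such as '@' or '/'.
		User: url.UserPassword(getenv("POSTGRES_USER"), getenv("POSTGRES_PASSWORD")),
		Host: net.JoinHostPort(getenv("POSTGRES_HOST"), getenv("POSTGRES_PORT")),
		Path: "/" + getenv("POSTGRES_DB"),
		// Assumption: local development without TLS.
		RawQuery: "sslmode=disable",
	}
	return u.String()
}

func main() {
	fmt.Println(postgresURL(os.Getenv))
}
```

Building the URL through net/url rather than string concatenation keeps passwords with special characters from corrupting the connection string.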

Step 4: Install Dependencies

go mod tidy

Step 5: Prepare the CSV File

A CSV file of CIDs is already provided in the data directory. To scrape additional CIDs, add them to that file. Each row should contain one CID.

Step 6: Run the Application

go run .

This will:

  • Read the CSV file.
  • Fetch metadata for each CID from IPFS.
  • Store the name and image fields in the PostgreSQL database.
  • Start the API server.

API Endpoints

Get All Metadata

Request:

GET /metadata

Response:
[
  {
    "cid": "Qm...",
    "name": "Example Name",
    "image": "Example Image URL"
  },
  ...
]

Get Metadata by CID

Request:

GET /metadata/:cid

Response:
{
  "cid": "Qm...",
  "name": "Example Name",
  "image": "Example Image URL"
}

Acknowledgements

  • Gin Gonic for the web framework.
  • sqlx for SQL database interactions.
  • godotenv for loading environment variables from a .env file.

Contact

For any questions or suggestions, please open an issue or contact the repository owner.