Squid generation tools

Subsquid provides tools for generating ready-to-use squids that index events and function calls of smart contracts. EVM/Solidity and WASM/ink! smart contracts are supported. The tools can be configured to make squids that save data to a PostgreSQL database or to a file-based dataset. All that is required is Node.js, the Subsquid CLI and, if your squid will be using a database, Docker.

The squid generation procedure is nearly the same for both contract types. Here are the steps:

  1. Create a new blank squid with sqd init using a suitable template:

    # for EVM/Solidity contracts
    sqd init my-squid -t abi
    # OR
    # for WASM/ink! contracts
    sqd init my-squid -t https://github.com/subsquid-labs/squid-ink-abi-template

    Enter the squid folder and install the dependencies:

    cd my-squid
    npm ci
  2. Write the configuration of the future squid to squidgen.yaml. Retrieve any necessary contract ABIs and store them at ./abi.

    Alternatively, skip to the next step and specify the configuration via CLI. Note: some features will not be available.

  3. Generate and build the squid code:

    npx squid-gen config squidgen.yaml
    npm run build

    If you chose to configure the tool via CLI instead, do so now. Here's an example:

    npx squid-gen-abi \
    --address 0x2E645469f354BB4F5c8a05B3b30A929361cf77eC \
    --archive https://v2.archive.subsquid.io/network/ethereum-mainnet \
    --event NewGravatar \
    --event UpdatedGravatar \
    --function '*' \
    --from 6000000

    See npx squid-gen-abi --help for all options.

  4. Prepare your squid for launching. If it is using a database, start a PostgreSQL container, then regenerate and apply migrations:

    docker compose up -d
    npx squid-typeorm-migration generate
    npx squid-typeorm-migration apply

    If it is storing its data to a dataset, strip the project folder of database-related facilities that are no longer needed.

  5. Test the complete squid by running it locally. Start a processor with

    node -r dotenv/config lib/main.js

    If your squid will be serving GraphQL, also run npx squid-graphql-server in a separate terminal. Then make sure that the squid saves the requested data to its target:

    • if it is serving GraphQL, visit the local GraphiQL playground;
    • for PostgreSQL-based squids you can also connect to the database with PGPASSWORD=postgres psql -U postgres -p 23798 -h localhost squid and take a look at the contents;
    • if it is storing data to a file-based dataset, wait for the first filesystem sync then verify that all the expected files are present and contain the expected data. If your squid produces data at a low rate, you may have to tweak the chunkSizeMb setting and/or add a ctx.store.setForceFlush() call to manually write dataset chunks at appropriate intervals.
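For the file-based case in step 5, a quick way to see what has been written so far is to list the dataset files. This is a minimal sketch, assuming the dataset lives in a local folder such as ./data and that the table files carry a .parquet extension; both are assumptions, so adjust them to your target.path and to the file names your squid produces. The helper name list_parquet_files is made up for this example.

```shell
# Print every Parquet file found under a dataset folder, one path per line.
# The folder defaults to ./data (an assumption); pass your own target.path.
list_parquet_files() {
  find "${1:-./data}" -type f -name '*.parquet' | sort
}

# Only list files if the folder already exists, i.e. after the first sync.
if [ -d ./data ]; then
  list_parquet_files ./data
fi
```

An empty listing after the squid has been running for a while is a hint that no chunk has been flushed yet.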

At this point your squid is ready. You can run it on your own infrastructure or deploy it to Subsquid Cloud.

Configuration

A valid configuration file for squid-gen config is a YAML file with the following sections:

  • archive is an endpoint URL of a Subsquid Network gateway. Find an appropriate gateway at the Supported networks page or with sqd gateways.

  • target section describes how the scraped data should be stored. Set

    target:
      type: postgres

    to use a PostgreSQL database that can be presented to users as a GraphQL API or used as-is. Another option is to store the data to a file-based dataset.

  • contracts is a list of contracts to be indexed. Define the following fields for each contract:

    • name
    • address
    • range (optional): block range to be indexed. An object with from and to properties, each of which can be omitted. Defaults to indexing the whole chain.
    • abi (optional on EVM): path to the contract JSON ABI. If omitted for an EVM contract, the tool will attempt to fetch the ABI by address from the Etherscan API or a compatible alternative set by the etherscanApi root option.
    • proxy (EVM-only): when indexing a proxy contract for events or calls defined in the implementation, set this option to its address and the address option to the address of the implementation contract. That way the tool will retrieve the ABI of the implementation and use it to index the output of the proxy.
    • events (optional): a list of events to be indexed or a boolean value. Set to true to index all events defined in the ABI. Defaults to false, meaning that no events are to be indexed.
    • functions (EVM-only, optional): a list of functions whose calls are to be indexed, or a boolean value. Set to true to index calls of all functions defined in the ABI. Defaults to false, meaning that no function calls are to be indexed.
  • etherscanApi (EVM-only, optional): Etherscan API-compatible endpoint to fetch contract ABI by a known address. Default: https://api.etherscan.io/.
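To show how these sections fit together, here is a sketch that writes a squidgen.yaml roughly equivalent to the squid-gen-abi CLI example from the procedure above. The contract name gravatar is a hypothetical label, and mapping --function '*' to functions: true is an assumption based on the field descriptions in this section. The abi field is left out, so the tool would attempt to fetch the ABI from Etherscan.

```shell
# Write a config that mirrors the CLI example. Note that this overwrites
# any squidgen.yaml already present in the current folder.
cat > squidgen.yaml <<'EOF'
archive: https://v2.archive.subsquid.io/network/ethereum-mainnet
target:
  type: postgres
contracts:
  - name: gravatar            # hypothetical name chosen for this sketch
    address: "0x2E645469f354BB4F5c8a05B3b30A929361cf77eC"
    range:
      from: 6000000           # "to" is omitted, so indexing is open-ended
    events:
      - NewGravatar
      - UpdatedGravatar
    functions: true           # assumed equivalent of --function '*'
EOF
```

The file can then be passed to npx squid-gen config squidgen.yaml as described in step 3.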

file-store targets

Currently the only file-based data target type supported by squid-gen packages is parquet. When used, it requires that the path field is also set alongside type. A path can be a local path or a URL pointing to a folder within a bucket on an S3-compatible cloud service.

Support for file-store is in alpha stage. Known caveats:

  • If an S3 URL is used, then the S3 region, endpoint and user credentials will be taken from the default environment variables. Fill your .env file and/or set your Subsquid Cloud secrets accordingly.

  • Unlike their PostgreSQL-powered equivalents, the squids that use file-store may not write their data often. You may have to configure the chunkSizeMb parameter of the Database class and/or call ctx.store.setForceFlush() when appropriate to strike an acceptable balance between the lag of the indexed data and the number of files in the resulting dataset. See Filesystem store overview for details.

  • For parquet targets, the Decimal(38) column type is used by the code generator to represent uint256. This is done for compatibility reasons: very few tools seem to support reading wider decimals from Parquet files. If you're getting a lot of errors containing value ... does not fit into Decimal(38, 0), consider replacing the Decimal(38) column type with Decimal(78) or String() at src/table.ts.

  • At the moment, squids generated with file-based data targets contain a lot of facilities for managing the database and have to be stripped of them before use.
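The Decimal(38) caveat comes down to digit counts: a Decimal(38, 0) column holds integers of at most 38 decimal digits, while the largest uint256 value has 78. The sketch below illustrates this with plain string lengths. The helper name fits_decimal38 is made up for this example, and it assumes a non-negative integer written without leading zeros.

```shell
# Largest uint256 value, 2^256 - 1, written out in decimal.
UINT256_MAX=115792089237316195423570985008687907853269984665640564039457584007913129639935

# Succeeds when the decimal string has at most 38 digits,
# i.e. when the value would fit into Decimal(38, 0).
fits_decimal38() {
  value="$1"
  [ "${#value}" -le 38 ]
}

echo "uint256 max has ${#UINT256_MAX} digits"
if fits_decimal38 "$UINT256_MAX"; then
  echo "fits into Decimal(38, 0)"
else
  echo "does not fit into Decimal(38, 0)"
fi
```

Values such as token amounts with 18 decimals usually stay well under 38 digits, which is why the narrower type works for many contracts and fails only on unusually large values.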

Strip the squid folder for file-store

Steps to convert a squid made with a database-enabled template for use with file-store:

  1. Remove unneeded files and packages.

    rm docker-compose.yml
    npm uninstall @subsquid/graphql-server @subsquid/typeorm-migration @subsquid/typeorm-store @subsquid/util-internal-json pg typeorm @subsquid/typeorm-codegen

  2. Replace commands.json with the one from the file-store-parquet-example repo.

    curl -o commands.json https://raw.githubusercontent.com/subsquid-labs/file-store-parquet-example/main/commands.json

  3. In squid.yaml, remove the deploy.addons section and replace the deploy.api section with

    api:
      cmd: [ "sleep", "3600" ]

  4. Install any required file-store packages.

    # if target.type was `parquet`
    npm install @subsquid/file-store-parquet
    # if target.path was an S3 URL
    npm install @subsquid/file-store-s3
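After stripping the folder, it can be useful to confirm that none of the uninstalled packages are still listed in package.json. The following is a rough sketch that simply greps the file for the package names from the uninstall step; the helper name leftover_db_packages is made up for this example, and it assumes the usual one-dependency-per-line JSON layout that npm writes.

```shell
# Print database-related packages still listed in a package.json.
# Prints nothing when the file is clean.
leftover_db_packages() {
  file="${1:-package.json}"
  pattern='"(@subsquid/(graphql-server|typeorm-migration|typeorm-store|typeorm-codegen|util-internal-json)|pg|typeorm)"[[:space:]]*:'
  # grep exits non-zero when nothing matches, which is the "clean" case here.
  { grep -oE "$pattern" "$file" || true; } | cut -d'"' -f2
}

if [ -f package.json ]; then
  leftover_db_packages package.json
fi
```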

Configuration examples

EVM/Solidity

  • Index LiquidationCall events of the AAVE V2 Lending Pool contract, starting from block 11362579 when it was deployed. Save the results to PostgreSQL. Use the ABI located at ./abi/aave-v2-pool.json.

    archive: eth-mainnet
    target:
      type: postgres
    contracts:
      - name: aave-v2-pool
        address: "0x7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9"
        abi: ./abi/aave-v2-pool.json
        range:
          from: 11362579
        events:
          - LiquidationCall

  • Index events and function calls of the DPX contract (a proxy) on Arbitrum, based on the ABI of the implementation contract retrieved from the Arbiscan API. Save the results to Parquet files at ./data.

    archive: arbitrum
    target:
      type: parquet
      path: ./data
    contracts:
      - name: dpx
        address: "0x3f770Ac673856F105b586bb393d122721265aD46"
        proxy: "0x6C2C06790b3E3E3c38e12Ee22F8183b37a13EE55"
        events: true
        functions: true
    etherscanApi: https://api.arbiscan.io/

    Note: this example is known to run into the integer length issue described in the file-store targets section. One way to make it work is to widen all Decimal column types from 38 to 78 digits:

    sed -i -e 's/38/78/g' src/table.ts

  • Index all events and function calls of the Positions NFT and Factory contracts of Uniswap and send the results to the uniswap-data folder of the subsquid-testing-bucket bucket.

    archive: eth-mainnet
    target:
      type: parquet
      path: s3://subsquid-testing-bucket/uniswap-data
    contracts:
      - name: positions
        address: "0xC36442b4a4522E871399CD717aBDD847Ab11FE88"
        events: true
        functions: true
      - name: factory
        address: "0x1F98431c8aD98523631AE4a59f267346ea31F984"
        events: true
        functions: true

    Notes:

    • This example is also susceptible to the integer length issue and will drop at least two events if used as-is, without widening the column types.
    • The generated squid requires some variables to be set to connect to S3. Here's an example of what .env may look like:
      S3_REGION=us-east-1
      S3_ENDPOINT=https://s3.filebase.com
      S3_ACCESS_KEY_ID=myAccessKeyId
      S3_SECRET_ACCESS_KEY=mySecretAccessKey
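Before starting a squid that writes to S3, it may help to check that the four variables listed above are actually set in the current environment. This is a minimal sketch; the helper name check_s3_env is made up for this example, and it only tests that the variables are non-empty, not that the credentials are valid.

```shell
# Report which of the S3 variables from the .env example are unset or empty.
# Returns 0 when all four are present, 1 otherwise.
check_s3_env() {
  missing=0
  for name in S3_REGION S3_ENDPOINT S3_ACCESS_KEY_ID S3_SECRET_ACCESS_KEY; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_s3_env || echo "S3 configuration is incomplete"
```

Note that variables defined only in .env are not visible to the shell until they are loaded, for example by the dotenv preload used when starting the processor.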

WASM/ink!

  • Index Transfer events emitted by an ERC20 contract on Shibuya and save the results to PostgreSQL. Do not forget to use the ink-abi template!

    archive: shibuya
    target:
      type: postgres
    contracts:
      - name: testToken
        abi: "./abi/erc20.json"
        address: "0x5207202c27b646ceeb294ce516d4334edafbd771f869215cb070ba51dd7e2c72"
        events:
          - Transfer

    Note: you can get the ABI from the squid-gen repository:

    curl -o abi/erc20.json https://raw.githubusercontent.com/subsquid/squid-gen/master/tests/ink-erc20/abi/erc20.json