Squid generation tools
Subsquid provides tools for generating ready-to-use squids that index events and function calls of smart contracts. EVM/Solidity and WASM/ink! smart contracts are supported. The tools can be configured to make squids that save data to a PostgreSQL database or to a file-based dataset. All that is required is Node.js, Subsquid CLI and, if your squid will be using a database, Docker.
The squid generation procedure is very similar for both contract types. Here are the steps:
-
Create a new blank squid with sqd init using a suitable template:
# for EVM/Solidity contracts
sqd init my-squid -t abi
# OR
# for WASM/ink! contracts
sqd init my-squid -t https://github.com/subsquid-labs/squid-ink-abi-template
Enter the squid folder and install the dependencies:
cd my-squid
npm ci
-
Write the configuration of the future squid to squidgen.yaml. Retrieve any necessary contract ABIs and store them at ./abi.
Alternatively, skip to the next step and specify the configuration via CLI. Note: some features will not be available.
-
Generate and build the squid code:
npx squid-gen config squidgen.yaml
npm run build
If you chose to configure the tool via CLI instead, do so now. Here's an example:
npx squid-gen-abi \
--address 0x2E645469f354BB4F5c8a05B3b30A929361cf77eC \
--archive https://v2.archive.subsquid.io/network/ethereum-mainnet \
--event NewGravatar \
--event UpdatedGravatar \
--function '*' \
--from 6000000
See npx squid-gen-abi --help for all options.
-
Prepare your squid for launching. If it is using a database, start a PostgreSQL container, then regenerate and apply migrations:
docker compose up -d
npx squid-typeorm-migration generate
npx squid-typeorm-migration apply
If it is storing its data to a dataset, strip the project folder of database-related facilities that are no longer needed.
-
Test the complete squid by running it locally. Start a processor with
node -r dotenv/config lib/main.js
If your squid will be serving GraphQL, also run
npx squid-graphql-server
in a separate terminal. Make sure that the squid saves the requested data to its target:
- if it is serving GraphQL, visit the local GraphiQL playground;
- for PostgreSQL-based squids you can also connect to the database with
PGPASSWORD=postgres psql -U postgres -p 23798 -h localhost squid
and take a look at the contents;
- if it is storing data to a file-based dataset, wait for the first filesystem sync, then verify that all the expected files are present and contain the expected data. If your squid produces data at a low rate, you may have to tweak the chunkSizeMb setting and/or add a ctx.store.setForceFlush() call to manually write dataset chunks at appropriate intervals.
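For a low-rate squid, the forced flush mentioned above could be driven by a simple block-count policy. The sketch below is not the file-store API: FlushableStore is a hypothetical stand-in that models only the setForceFlush() call named in the text (its boolean argument is an assumption), and makeFlushPolicy with its maxBlocksBetweenFlushes parameter is an illustrative name.

```typescript
// Hypothetical stand-in for the store object exposed as ctx.store.
// Only the setForceFlush() call named in the text is modeled here;
// its boolean argument is an assumption.
interface FlushableStore {
  setForceFlush(flag: boolean): void;
}

// Returns a callback to invoke once per processed batch with the height of
// the last block in that batch. It requests a forced flush whenever at least
// maxBlocksBetweenFlushes blocks have passed since the previous request.
function makeFlushPolicy(maxBlocksBetweenFlushes: number) {
  let lastFlushBlock: number | undefined;
  return (store: FlushableStore, lastBlockInBatch: number): boolean => {
    if (lastFlushBlock === undefined) {
      // First batch: remember the starting point, do not flush yet.
      lastFlushBlock = lastBlockInBatch;
      return false;
    }
    if (lastBlockInBatch - lastFlushBlock >= maxBlocksBetweenFlushes) {
      store.setForceFlush(true);
      lastFlushBlock = lastBlockInBatch;
      return true;
    }
    return false;
  };
}
```

In a real batch handler the callback would presumably receive ctx.store and the height of the last block in the batch. A smaller interval reduces the lag of the dataset at the cost of producing more files.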
At this point your squid is ready. You can run it on your own infrastructure or deploy it to Subsquid Cloud.
Configuration
A valid config for squid-gen config is a YAML file with the following sections:
-
archive is an endpoint URL of a Subsquid Network gateway. Find an appropriate gateway at the Supported networks page or with sqd gateways.
-
target section describes how the scraped data should be stored. Set
target:
  type: postgres
to use a PostgreSQL database that can be presented to users as a GraphQL API or used as-is. Another option is to store the data to a file-based dataset.
-
contracts is a list of contracts to be indexed. Define the following fields for each contract:
- name
- address
- range (optional): block range to be indexed. An object with from and to properties, each of which can be omitted. Defaults to indexing the whole chain.
- abi (optional on EVM): path to the contract JSON ABI. If omitted for an EVM contract, the tool will attempt to fetch the ABI by address from the Etherscan API or a compatible alternative set by the etherscanApi root option.
- proxy (EVM-only): when indexing a proxy contract for events or calls defined in the implementation, set this option to its address and the address option to the address of the implementation contract. That way the tool will retrieve the ABI of the implementation and use it to index the output of the proxy.
- events (optional): a list of events to be indexed or a boolean value. Set to true to index all events defined in the ABI. Defaults to false, meaning that no events are to be indexed.
- functions (EVM-only, optional): a list of functions the calls of which are to be indexed or a boolean value. Set to true to index calls of all functions defined in the ABI. Defaults to false, meaning that no function calls are to be indexed.
-
etherscanApi (EVM-only, optional): Etherscan API-compatible endpoint to fetch contract ABI by a known address. Default: https://api.etherscan.io/.
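To summarize the sections above, here is a rough TypeScript rendering of the config shape together with a small consistency check. The type names and the checkConfig function are illustrative and are not exported by squid-gen. The rule that abi is required for ink! contracts is inferred from "optional on EVM", and the from/to ordering check is an extra assumption.

```typescript
// Illustrative types mirroring the fields described above. These are a
// reading aid written for this page, not types exported by squid-gen.
type ContractKind = "evm" | "ink";

interface BlockRange {
  from?: number;
  to?: number;
}

interface ContractConfig {
  name: string;
  address: string;
  range?: BlockRange;
  abi?: string;
  proxy?: string;
  events?: string[] | boolean;
  functions?: string[] | boolean;
}

interface SquidGenConfig {
  archive: string;
  target: { type: string }; // further target fields depend on the target type
  contracts: ContractConfig[];
  etherscanApi?: string;
}

// Collects human-readable problems instead of throwing on the first one.
function checkConfig(config: SquidGenConfig, kind: ContractKind): string[] {
  const problems: string[] = [];
  if (kind === "ink" && config.etherscanApi !== undefined) {
    problems.push("etherscanApi is EVM-only");
  }
  for (const c of config.contracts) {
    if (kind === "ink") {
      // Inferred rule: abi is only optional on EVM.
      if (c.abi === undefined) problems.push(`${c.name}: abi is required`);
      if (c.proxy !== undefined) problems.push(`${c.name}: proxy is EVM-only`);
      if (c.functions !== undefined) problems.push(`${c.name}: functions is EVM-only`);
    }
    const r = c.range;
    // Assumption: a range with from > to is a mistake.
    if (r && r.from !== undefined && r.to !== undefined && r.from > r.to) {
      problems.push(`${c.name}: range.from is greater than range.to`);
    }
  }
  return problems;
}
```

An empty result only means that the config agrees with the rules spelled out on this page; the generator itself may apply further checks.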
file-store targets
Currently the only file-based data target type supported by squid-gen packages is parquet. When used, it requires that the path field is also set alongside type. A path can be a local path or a URL pointing to a folder within a bucket on an S3-compatible cloud service.
Support for file-store is in alpha stage. Known caveats:
-
If an S3 URL is used, then the S3 region, endpoint and user credentials will be taken from the default environment variables. Fill your .env file and/or set your Subsquid Cloud secrets accordingly.
-
Unlike their PostgreSQL-powered equivalents, the squids that use file-store may not write their data often. You may have to configure the chunkSizeMb parameter of the Database class and/or call ctx.store.setForceFlush() when appropriate to strike an acceptable balance between the lag of the indexed data and the number of files in the resulting dataset. See Filesystem store overview for details.
-
For parquet targets, the Decimal(38) column type is used by the code generator to represent uint256. This is done for compatibility reasons: very few tools seem to support reading wider decimals from Parquet files. If you're getting a lot of errors containing value ... does not fit into Decimal(38, 0), consider replacing the Decimal(38) column type with Decimal(78) or String() at src/table.ts.
-
At the moment, squids generated with file-based data targets contain a lot of facilities for managing the database and have to be stripped of them before use.
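The Decimal(38) caveat above comes down to simple arithmetic: a Decimal(p, 0) column holds integers of at most p decimal digits, while the largest uint256 value has 78 digits. The following sketch checks this with BigInt; fitsDecimal and decimalDigits are helper names made up for this illustration and are not part of any Subsquid package.

```typescript
// Largest value representable as uint256: 2^256 - 1.
const UINT256_MAX: bigint = (BigInt(1) << BigInt(256)) - BigInt(1);

// Number of decimal digits of a non-negative integer.
function decimalDigits(value: bigint): number {
  return value.toString().length;
}

// True if the integer fits a Decimal(precision, 0) column,
// i.e. if its absolute value is below 10^precision.
function fitsDecimal(value: bigint, precision: number): boolean {
  const limit = BigInt("1" + "0".repeat(precision)); // 10^precision
  return value > -limit && value < limit;
}
```

With these helpers, decimalDigits(UINT256_MAX) is 78, so fitsDecimal(UINT256_MAX, 38) is false while fitsDecimal(UINT256_MAX, 78) is true. That is why widening the columns to Decimal(78) removes the errors, at the price of compatibility with tools that cannot read wide decimals.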
Strip the squid folder for file-store
Steps to convert a squid made with a database-enabled template for use with file-store:
- Remove unneeded files and packages.
rm docker-compose.yml
npm uninstall @subsquid/graphql-server @subsquid/typeorm-migration @subsquid/typeorm-store @subsquid/util-internal-json pg typeorm @subsquid/typeorm-codegen
- Replace commands.json with the one from the file-store-parquet-example repo.
curl -o commands.json https://raw.githubusercontent.com/subsquid-labs/file-store-parquet-example/main/commands.json
- In squid.yaml, remove the deploy.addons section and replace the deploy.api section with
api:
  cmd: [ "sleep", "3600" ]
- Install any required file-store packages.
# if target.type was `parquet`
npm install @subsquid/file-store-parquet
# if target.path was an S3 URL
npm install @subsquid/file-store-s3
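The package choice in the last step follows mechanically from the target section of the config. Below is a small sketch of that mapping; FileTarget and requiredFileStorePackages are illustrative names, and detecting an S3 URL by its s3:// prefix is an assumption based on the path format used in the configuration examples.

```typescript
// Target description as written in squidgen.yaml (illustrative type).
interface FileTarget {
  type: string;
  path?: string;
}

// Maps a target to the file-store packages listed in the steps above.
function requiredFileStorePackages(target: FileTarget): string[] {
  const packages: string[] = [];
  if (target.type === "parquet") {
    packages.push("@subsquid/file-store-parquet");
  }
  // Assumption: S3 destinations are written as s3://bucket/folder.
  if (target.path !== undefined && target.path.startsWith("s3://")) {
    packages.push("@subsquid/file-store-s3");
  }
  return packages;
}
```

For a local Parquet target this yields only the Parquet package, and for an S3 path it yields both packages.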
Configuration examples
EVM/Solidity
-
Index LiquidationCall events of the AAVE V2 Lending Pool contract, starting from block 11362579 when it was deployed. Save the results to PostgreSQL. Use the ABI located at ./abi/aave-v2-pool.json.
archive: eth-mainnet
target:
  type: postgres
contracts:
  - name: aave-v2-pool
    address: "0x7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9"
    abi: ./abi/aave-v2-pool.json
    range:
      from: 11362579
    events:
      - LiquidationCall
-
Index events and function calls of the DPX contract (a proxy) on Arbitrum, based on the ABI of the implementation contract retrieved from Arbiscan API. Save the results to Parquet files at ./data.
archive: arbitrum
target:
  type: parquet
  path: ./data
contracts:
  - name: dpx
    address: "0x3f770Ac673856F105b586bb393d122721265aD46"
    proxy: "0x6C2C06790b3E3E3c38e12Ee22F8183b37a13EE55"
    events: true
    functions: true
etherscanApi: https://api.arbiscan.io/
Note: this example is known to run into the integer length issue described in the file-store targets section. One way to make it work is to widen all Decimal column types from 38 to 78 digits:
sed -i -e 's/38/78/g' src/table.ts
-
Index all events and function calls of the Positions NFT and Factory contracts of Uniswap, send the results to the uniswap-data folder of the subsquid-testing-bucket bucket.
archive: eth-mainnet
target:
  type: parquet
  path: s3://subsquid-testing-bucket/uniswap-data
contracts:
  - name: positions
    address: "0xC36442b4a4522E871399CD717aBDD847Ab11FE88"
    events: true
    functions: true
  - name: factory
    address: "0x1F98431c8aD98523631AE4a59f267346ea31F984"
    events: true
    functions: true
Notes:
- This example is also susceptible to the integer length issue and will drop at least two events if used as-is, without widening the column types.
- The generated squid requires some variables to be set to connect to S3. Here's an example of what .env may look like:
S3_REGION=us-east-1
S3_ENDPOINT=https://s3.filebase.com
S3_ACCESS_KEY_ID=myAccessKeyId
S3_SECRET_ACCESS_KEY=mySecretAccessKey
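A quick sanity check before launching an S3-bound squid could look like the sketch below. It only tests for the four variable names shown in the .env example above; whether the generated squid accepts other variable names or fallbacks is not covered here. The missingS3Variables helper is an illustrative name and takes the environment as a plain object so that the snippet has no dependencies.

```typescript
// Variable names taken from the .env example above.
const S3_VARIABLES = [
  "S3_REGION",
  "S3_ENDPOINT",
  "S3_ACCESS_KEY_ID",
  "S3_SECRET_ACCESS_KEY",
] as const;

// Returns the names of the variables that are unset or empty.
function missingS3Variables(env: Record<string, string | undefined>): string[] {
  return S3_VARIABLES.filter((name) => !env[name]);
}
```

In a Node.js script the function would typically be called with process.env after the .env file has been loaded, aborting with a readable message if the returned list is not empty.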
WASM/ink!
- Index Transfer events emitted by an ERC20 contract on Shibuya, save results to PostgreSQL. Do not forget to use the ink-abi template!
archive: shibuya
target:
  type: postgres
contracts:
  - name: testToken
    abi: "./abi/erc20.json"
    address: "0x5207202c27b646ceeb294ce516d4334edafbd771f869215cb070ba51dd7e2c72"
    events:
      - Transfer
Note: you can get the ABI from the squid-gen repository:
curl -o abi/erc20.json https://raw.githubusercontent.com/subsquid/squid-gen/master/tests/ink-erc20/abi/erc20.json