> ## Documentation Index
> Fetch the complete documentation index at: https://docs.sqd.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Development flow

> Recommended Squid SDK development workflow — model the schema, write the processor, run migrations, iterate locally, and deploy to SQD Cloud.

This page is a definitive end-to-end guide into practical squid development. It uses templates to simplify the process. Check out [Squid from scratch](/en/sdk/squid-sdk/how-to-start/squid-from-scratch) for a more educational barebones approach.

<Info>
  Feel free to also use the template-specific `sqd` scripts defined in [`commands.json`](/en/sdk/squid-sdk/squid-cli/commands-json) to simplify your workflow. See [sqd CLI cheatsheet](/en/sdk/squid-sdk/how-to-start/cli-cheatsheet) for a short intro.
</Info>

## Prepare the environment

* Node v16.x or newer
* Git
* [Squid CLI](/en/sdk/squid-sdk/squid-cli/installation)
* Docker (if your squid will store its data to PostgreSQL)

See also the [Environment set up](/en/sdk/squid-sdk/how-to-start/development-environment-set-up) page.

## Understand your technical requirements

Consider your business requirements and find out

1. How the data should be delivered. Options:
   * [PostgreSQL](/en/sdk/squid-sdk/resources/persisting-data/typeorm) with an optional [GraphQL API](/en/sdk/squid-sdk/resources/serving-graphql) - can be real-time
   * [file-based dataset](/en/sdk/squid-sdk/resources/persisting-data/file) - local or on S3
   * [Google BigQuery](/en/sdk/squid-sdk/resources/persisting-data/bigquery)
2. What data should be delivered
3. What are the technologies powering the blockchain(s) in question. Supported options:

   * Ethereum Virtual Machine (EVM) chains like [Ethereum](https://ethereum.org) - [supported networks](/en/data/evm)
   * [Substrate](https://substrate.io)-powered chains like [Polkadot](https://polkadot.network) and [Kusama](https://kusama.network) - [supported networks](/en/data/substrate)

   Note that you can use SQD via [RPC ingestion](/en/sdk/squid-sdk/resources/unfinalized-blocks) even if your network is not listed.
4. What exact data should be retrieved from blockchain(s)
5. Whether you need to mix in any [off-chain data](/en/sdk/squid-sdk/resources/external-api)

#### Example requirements

<AccordionGroup>
  <Accordion title="DEX analytics on Polygon">
    Suppose you want to train a prototype ML model on all trades done on Uniswap Polygon since the v3 upgrade.

    1. A delay of a few hours typically won't matter for training, so you may want to deliver the data as files for easier handling.
    2. The output could be a simple list of swaps, listing pair, direction and token amounts for each.
    3. Polygon is an EVM chain.
    4. All the required data is contained within `Swap` events emitted by the pair pool contracts. Uniswap deploys these [dynamically](/en/sdk/squid-sdk/resources/evm/factory-contracts), so you will also have to capture `PoolCreated` events from the factory contract to know which `Swap` events are coming from Uniswap and map them to pairs.
    5. No off-chain data will be necessary for this task.
  </Accordion>

  <Accordion title="NFT ownership on Ethereum">
    Suppose you want to make a website that shows the image and ownership history for ERC721 NFTs from a certain Ethereum contract.

    1. For this application it makes sense to deliver a GraphQL API.
    2. Output data might have `Token`, `Owner` and `Transfer` database tables / [entities](/en/sdk/squid-sdk/reference/schema-file/entities), with e.g. `Token` supplying all the fields necessary to show ownership history and the image.
    3. Ethereum is an EVM chain.
    4. Data on token mints and ownership history can be derived from `Transfer(address,address,uint256)` EVM event logs emitted by the contract. To render images, you will also need token metadata URLs that are only available by [querying the contract state](/en/sdk/squid-sdk/resources/tools/typegen/state-queries) with the `tokenURI(uint256)` function.
    5. You'll need to retrieve the off-chain token metadata (usually from IPFS).
  </Accordion>

  <Accordion title="Kusama transfers BigQuery dataset">
    Suppose you want to create a BigQuery dataset with Kusama native tokens transfers.

    1. The delivery format is BigQuery.
    2. A single table with `from`, `to` and `amount` columns may suffice.
    3. Kusama is a Substrate chain.
    4. The required data is available from `Transfer` events emitted by the `Balances` pallet. Take a look at our [Substrate data sourcing miniguide](/en/sdk/squid-sdk/resources/substrate/data-sourcing-miniguide) for more info on how to figure out which pallets, events and calls are necessary for your task.
    5. No off-chain data will be necessary for this task.
  </Accordion>
</AccordionGroup>

<h2 id="templates">
  Start from a template
</h2>

Although it is possible to [compose a squid from individual packages](/en/sdk/squid-sdk/how-to-start/squid-from-scratch), in practice it is usually easier to start from a template.

<Tabs>
  <Tab title="EVM">
    <AccordionGroup>
      <Accordion title="Templates for the PostgreSQL+GraphQL data destination">
        * A minimal template intended for developing EVM squids. Indexes ETH burns.
          ```bash theme={"system"}
          sqd init my-squid-name -t evm
          ```
        * A starter squid for indexing ERC20 transfers.
          ```bash theme={"system"}
          sqd init my-squid-name -t https://github.com/subsquid-labs/squid-erc20-template
          ```
        * Classic [example Subgraph](https://github.com/graphprotocol/example-subgraph) after a [migration](/en/sdk/squid-sdk/resources/migrate/migrate-subgraph) to SQD.
          ```bash theme={"system"}
          sqd init my-squid-name -t gravatar
          ```
        * A template showing how to [combine data from multiple chains](/en/sdk/squid-sdk/resources/multichain). Indexes USDC transfers on Ethereum and Binance.
          ```bash theme={"system"}
          sqd init my-squid-name -t multichain
          ```
      </Accordion>

      <Accordion title="Templates for storing data in files">
        * USDC transfers -> local CSV
          ```bash theme={"system"}
          sqd init my-squid-name -t https://github.com/subsquid-labs/file-store-csv-example
          ```
        * USDC transfers -> local Parquet
          ```bash theme={"system"}
          sqd init my-squid-name -t https://github.com/subsquid-labs/file-store-parquet-example
          ```
        * USDC transfers -> CSV on S3
          ```bash theme={"system"}
          sqd init my-squid-name -t https://github.com/subsquid-labs/file-store-s3-example
          ```
      </Accordion>

      <Accordion title="Templates for the Google BigQuery data destination">
        * USDC transfers -> BigQuery dataset
          ```bash theme={"system"}
          sqd init my-squid-name -t https://github.com/subsquid-labs/squid-bigquery-example
          ```
      </Accordion>
    </AccordionGroup>
  </Tab>

  <Tab title="Substrate">
    <Accordion title="Templates for the PostgreSQL+GraphQL data destination">
      * Native events emitted by Substrate-based chains
        ```bash theme={"system"}
        sqd init my-squid-name -t substrate
        ```
      * ink! smart contracts
        ```bash theme={"system"}
        sqd init my-squid-name -t ink
        ```
      * Frontier EVM contracts on Astar and Moonbeam
        ```bash theme={"system"}
        sqd init my-squid-name -t frontier-evm
        ```
    </Accordion>
  </Tab>
</Tabs>

After retrieving the template of choice install its dependencies:

```bash theme={"system"}
cd my-squid-name
npm i
```

Test the template locally. The procedure varies depending on the data sink:

<Tabs>
  <Tab title="PostgreSQL+GraphQL">
    1. Launch a PostgreSQL container with
       ```bash theme={"system"}
       docker compose up -d
       ```

    2. Build the squid with
       ```bash theme={"system"}
       npm run build
       ```

    3. Apply the DB migrations with
       ```bash theme={"system"}
       npx squid-typeorm-migration apply
       ```

    4. Start the squid processor with
       ```bash theme={"system"}
       node -r dotenv/config lib/main.js
       ```
       You should see output that contains lines like these ones:
       ```bash theme={"system"}
       04:11:24 INFO  sqd:processor processing blocks from 6000000
       04:11:24 INFO  sqd:processor using archive data source
       04:11:24 INFO  sqd:processor prometheus metrics are served at port 45829
       04:11:27 INFO  sqd:processor 6051219 / 18079056, rate: 16781 blocks/sec, mapping: 770 blocks/sec, 544 items/sec, eta: 12m
       ```

    5. Start the GraphQL server by running
       ```bash theme={"system"}
       npx squid-graphql-server
       ```
       in a separate terminal, then visit the [GraphiQL console](http://localhost:4350/graphql) to verify that the GraphQL API is up.

    When done, shut down and erase your database with `docker compose down`.
  </Tab>

  <Tab title="filesystem dataset">
    1. (for the S3 template only) Set the credentials and prepare a bucket for your data as described in the [template README](https://github.com/subsquid-labs/file-store-s3-example/blob/main/README.md).

    2. Build the squid with
       ```bash theme={"system"}
       npm run build
       ```

    3. Start the squid processor with
       ```bash theme={"system"}
       node -r dotenv/config lib/main.js
       ```
       The output should contain lines like these ones:
       ```bash theme={"system"}
       04:11:24 INFO  sqd:processor processing blocks from 6000000
       04:11:24 INFO  sqd:processor using archive data source
       04:11:24 INFO  sqd:processor prometheus metrics are served at port 45829
       04:11:27 INFO  sqd:processor 6051219 / 18079056, rate: 16781 blocks/sec, mapping: 770 blocks/sec, 544 items/sec, eta: 12m
       ```
       You should see a `./data` folder populated with indexer data appear in a bit. A local folder looks like this:
       ```bash theme={"system"}
       $ tree ./data/
       ./data/
       ├── 0000000000-0007242369
       │   └── transfers.tsv
       ├── 0007242370-0007638609
       │   └── transfers.tsv
       ...
       └── status.txt
       ```
  </Tab>

  <Tab title="BigQuery">
    Create a dataset with your BigQuery account, then follow the [template README](https://github.com/subsquid-labs/squid-bigquery-example/blob/master/README.md).
  </Tab>
</Tabs>

<h2 id="bottom-up-development">
  The bottom-up development cycle
</h2>

The advantage of this approach is that the code remains buildable at all times, making it easier to catch issues early.

<h3 id="typegen">
  I. Regenerate the task-specific utilities
</h3>

<Tabs>
  <Tab title="EVM">
    Retrieve JSON ABIs for all contracts of interest (e.g. from Etherscan), taking care to get ABIs for implementation contracts and not [proxies](/en/sdk/squid-sdk/resources/evm/proxy-contracts) where appropriate. Assuming that you saved the ABI files to `./abi`, you can then regenerate the utilities with

    ```bash theme={"system"}
    npx squid-evm-typegen ./src/abi ./abi/*.json --multicall
    ```

    Or if you would like the tool to retrieve the ABI from Etherscan in your stead, you can run e.g.

    ```bash theme={"system"}
    npx squid-evm-typegen \
      src/abi \
      0xdAC17F958D2ee523a2206206994597C13D831ec7#usdt
    ```

    The utility classes will become available at `src/abi`.

    See also [EVM typegen code generation](/en/sdk/squid-sdk/resources/tools/typegen/generation).
  </Tab>

  <Tab title="Substrate">
    Follow the respective reference configuration pages of each typegen tool:

    * [Substrate typegen configuration](/en/sdk/squid-sdk/resources/tools/typegen/generation)
    * [ink! typegen configuration](/en/sdk/squid-sdk/resources/tools/typegen/generation)

    <Accordion title="A tip for `frontier-evm` squids">
      These squids use both Substrate typegen *and* EVM typegen. To generate all the required utilities, [configure the Substrate part](/en/sdk/squid-sdk/resources/tools/typegen/generation), then save all relevant JSON ABIs to `./abi`, then run

      ```bash theme={"system"}
      npx squid-evm-typegen ./src/abi ./abi/*.json --multicall
      ```

      followed by

      ```bash theme={"system"}
      npx squid-substrate-typegen ./typegen.json
      ```
    </Accordion>
  </Tab>
</Tabs>

<h3 id="processor-config">
  II. Configure the data requests
</h3>

Data requests are [customarily](/en/sdk/squid-sdk/how-to-start/layout) defined at `src/processor.ts`. The details depend on the network type:

<Tabs>
  <Tab title="EVM">
    Edit the definition of `const processor` to

    1. Use a data source appropriate for your chain and task.
       * It is possible to [use RPC](/en/sdk/squid-sdk/reference/processors/evm-batch/general#set-rpc-endpoint) as the only data source, but [adding](/en/sdk/squid-sdk/reference/processors/evm-batch/general#set-gateway) a [SQD Network](/en/data/evm) data source will make your squid sync much faster.
       * RPC is a hard requirement if you're building a real-time API.
       * If you're using RPC as one of your data sources, make sure to [set the number of finality confirmations](/en/sdk/squid-sdk/reference/processors/evm-batch/general#set-finality-confirmation) so that [hot blocks ingestion](/en/sdk/squid-sdk/resources/unfinalized-blocks) works properly.

    2. Request all [event logs](/en/sdk/squid-sdk/reference/processors/evm-batch/logs), [transactions](/en/sdk/squid-sdk/reference/processors/evm-batch/transactions), [execution traces](/en/sdk/squid-sdk/reference/processors/evm-batch/traces) and [state diffs](/en/sdk/squid-sdk/reference/processors/evm-batch/state-diffs) that your task requires, with any necessary related data (e.g. parent transactions for event logs).

    3. [Select all data fields](/en/sdk/squid-sdk/reference/processors/evm-batch/field-selection) necessary for your task (e.g. `gasUsed` for transactions).

    See [reference documentation](/en/sdk/squid-sdk/reference/processors/evm-batch) for more info and [processor configuration showcase](/en/sdk/squid-sdk/examples) for a representative set of examples.
  </Tab>

  <Tab title="Substrate">
    Edit the definition of `const processor` to

    1. Use a data source appropriate for your chain and task
       * [Use](/en/sdk/squid-sdk/reference/processors/substrate-batch/general#set-gateway) a [SQD Network gateway](/en/data/substrate) whenever it is available. [RPC](/en/sdk/squid-sdk/reference/processors/evm-batch/general#set-rpc-endpoint) is still required in this case.
       * For networks without a gateway use just the RPC.

    2. Request all [events](/en/sdk/squid-sdk/reference/processors/substrate-batch/data-requests#events) and [calls](/en/sdk/squid-sdk/reference/processors/substrate-batch/data-requests#calls) that your task requires, with any necessary related data (e.g. parent extrinsics).

    3. If your squid indexes any of the following:

       * an [ink! contract](/en/sdk/squid-sdk/resources/substrate/ink)
       * an EVM contract running on the [Frontier EVM pallet](/en/sdk/squid-sdk/resources/substrate/frontier-evm)
       * [Gear messages](/en/sdk/squid-sdk/resources/substrate/gear)

       then you can use some of the [specialized data requesting methods](/en/sdk/squid-sdk/reference/processors/substrate-batch/data-requests#specialized-setters) to retrieve data more selectively.

    4. [Select all data fields](/en/sdk/squid-sdk/reference/processors/substrate-batch/field-selection) necessary for your task (e.g. `fee` for extrinsics).

    See [reference documentation](/en/sdk/squid-sdk/reference/processors/substrate-batch) for more info. Processor config examples can be found in the tutorials:

    * [general Substrate](/en/sdk/squid-sdk/tutorials/substrate)

    * [ink!](/en/sdk/squid-sdk/tutorials/ink)

    * [Frontier EVM](/en/sdk/squid-sdk/tutorials/frontier-evm)
  </Tab>
</Tabs>

<h3 id="batch-handler-decoding">
  III. Decode and normalize the data
</h3>

Next, change the batch handler to decode and normalize your data.

In templates, the batch handler is defined at the [`processor.run()`](/en/sdk/squid-sdk/reference/processors/architecture#processorrun) call in `src/main.ts` as an inline function. Its sole argument `ctx` contains:

* at `ctx.blocks`: all the requested data for a batch of blocks
* at `ctx.store`: the means to save the processed data
* at `ctx.log`: a [`Logger`](/en/sdk/squid-sdk/reference/logger)
* at `ctx.isHead`: a boolean indicating whether the batch is at the current chain head
* at `ctx._chain`: the means to access RPC for [state calls](#external-data)

This structure ([reference](/en/sdk/squid-sdk/reference/processors/architecture#batch-context)) is common for all processors; the structure of `ctx.blocks` items varies.

<Tabs>
  <Tab title="EVM">
    Each item in `ctx.blocks` contains the data for the requested logs, transactions, traces and state diffs for a particular block, plus some info on the block itself. See [EVM batch context reference](/en/sdk/squid-sdk/reference/processors/evm-batch/context-interfaces).

    Use the `.decode` methods from the [contract ABI utilities](#typegen) to decode events and transactions, e.g.

    ```ts theme={"system"}
    import * as erc20abi from './abi/erc20'

    processor.run(db, async ctx => {
      for (let block of ctx.blocks) {
        for (let log of block.logs) {
          if (log.topics[0]===erc20abi.events.Transfer.topic) {
            let {from, to, value} = erc20.events.Transfer.decode(log)
          }
        }
      }
    })
    ```

    See also the [EVM data decoding](/en/sdk/squid-sdk/resources/tools/typegen/decoding).
  </Tab>

  <Tab title="Substrate">
    Each item in `ctx.blocks` contains the data for the requested events, calls and, if requested, any related extrinsics; it also has some info on the block itself. See [Substrate batch context reference](/en/sdk/squid-sdk/reference/processors/substrate-batch/context-interfaces).

    Use the `.is()` and `.decode()` functions to decode the data for each runtime version, e.g. like this:

    ```ts theme={"system"}
    import {events} from './types'

    processor.run(db, async ctx => {
      for (let block of ctx.blocks) {
        for (let event of block.events) {
          if (event.name == events.balances.transfer.name) {
            let rec: {from: string; to: string; amount: bigint}
            if (events.balances.transfer.v1020.is(event)) {
              let [from, to, amount] = events.balances.transfer.v1020.decode(event)
              rec = {from, to, amount}
            }
            else if (events.balances.transfer.v1050.is(event)) {
              let [from, to, amount] = events.balances.transfer.v1050.decode(event)
              rec = {from, to, amount}
            }
            else if (events.balances.transfer.v9130.is(event)) {
              rec = events.balances.transfer.v9130.decode(event)
            }
            else {
              throw new Error('Unsupported spec')
            }
          }
        }
      }
    })

    ```

    See also the [Substrate data decoding](/en/sdk/squid-sdk/resources/tools/typegen/decoding).

    You can also decode the data of certain pallet-specific events and transactions with specialized tools:

    * use the utility classes made with `@substrate/squid-ink-typegen` to [decode events emitted by ink! contracts](/en/sdk/squid-sdk/resources/tools/typegen/decoding)
    * use the [`@subsquid/frontier` utils](/en/sdk/squid-sdk/reference/frontier) and the [EVM typegen](/en/sdk/squid-sdk/resources/tools/typegen/decoding) to decode event logs and transactions of EVM contracts
  </Tab>
</Tabs>

<h3 id="external-data">
  (Optional) IV. Mix in external data and chain state calls output
</h3>

If you need external (i.e. non-blockchain) data in your transformation, take a look at the [External APIs and IPFS](/en/sdk/squid-sdk/resources/external-api) page.

If any of the on-chain data you need is unavalable from the processor or incovenient to retrieve with it, you have an option to get it via [direct chain queries](/en/sdk/squid-sdk/resources/tools/typegen/state-queries).

<h3 id="store">
  V. Prepare the store
</h3>

At `src/main.ts`, change the [`Database`](/en/sdk/squid-sdk/resources/persisting-data/overview) object definition to accept your output data. The methods for saving data will be exposed by `ctx.store` within the [batch handler](/en/sdk/squid-sdk/reference/processors/architecture).

<Tabs>
  <Tab title="PostgreSQL+GraphQL">
    1. Define the schema of the database (and the [core schema of the OpenReader GraphQL API](/en/sdk/squid-sdk/reference/openreader-server/api) if it is used) at [`schema.graphql`](/en/sdk/squid-sdk/reference/schema-file).

    2. Regenerate the TypeORM model classes with
       ```bash theme={"system"}
       npx squid-typeorm-codegen
       ```
       The classes will become available at `src/model`.

    3. Compile the models code with
       ```bash theme={"system"}
       npm run build
       ```

    4. Ensure that the squid has access to a blank database. The easiest way to do so is to start PostgreSQL in a Docker container with

       ```bash theme={"system"}
       docker compose up -d
       ```

       If the container is running, stop it and erase the database with

       ```bash theme={"system"}
       docker compose down
       ```

       before issuing an `docker compose up -d`.

       The alternative is to connect to an external database. See [this section](/en/sdk/squid-sdk/reference/store/typeorm#database-connection-parameters) to learn how to specify the connection parameters.

    5. Regenerate a migration with
       ```bash theme={"system"}
       rm -r db/migrations
       ```
       ```bash theme={"system"}
       npx squid-typeorm-migration generate
       ```

    You can now use the async functions [`ctx.store.upsert()`](/en/sdk/squid-sdk/reference/store/typeorm#upsert) and [`ctx.store.insert()`](/en/sdk/squid-sdk/reference/store/typeorm#insert), as well as various [TypeORM lookup methods](/en/sdk/squid-sdk/reference/store/typeorm#typeorm-methods) to access the database.

    See the `typeorm-store` [guide](/en/sdk/squid-sdk/resources/persisting-data/typeorm) and [reference](/en/sdk/squid-sdk/reference/store/typeorm) for more info.
  </Tab>

  <Tab title="filesystem dataset">
    Filesystem dataset writing, as performed by the `@subsquid/file-store` package and its extensions, stores the data into one or more flat tables. The exact table definition format depends on the output file format.

    1. Decide on the file format you're going to use:

       * [Parquet](/en/sdk/squid-sdk/reference/store/file/parquet)
       * [CSV](/en/sdk/squid-sdk/reference/store/file/csv)
       * [JSON/JSONL](/en/sdk/squid-sdk/reference/store/file/json)

       If your template does not have any of the necessary packages, install them.

    2. Define any tables you need at the `tables` field of the `Database` constructor argument:
       ```ts theme={"system"}
       import { Database } from '@subsquid/file-store'

       const dbOptions = {
         tables: {
           FirstTable: new Table(/* ... */),
           SecondTable: new Table(/* ... */),
           // ...
         },
         // ...
       }

       processor.run(new Database(dbOptions), async ctx => { // ...
       ```

    3. Define the destination filesystem via the `dest` field of the `Database` constructor argument. Options:
       * local folder - use `LocalDest` from `@subsquid/file-store`
       * S3-compatible file storage service - install `@subsquid/file-store-s3` and use [`S3Dest`](/en/sdk/squid-sdk/reference/store/file/s3-dest)

    Once you're done you'll be able to enqueue data rows for saving using the `write()` and `writeMany()` methods of the context store-provided table objects:

    ```ts theme={"system"}
    ctx.store.FirstTable.writeMany(/* ... */)
    ctx.store.SecondTable.write(/* ... */)
    ```

    The store will write the files automatically as soon as the buffer reaches the size set by the `chunkSizeMb` field of the `Database` constructor argument, or at the end of the batch if a call to [`setForceFlush()`](/en/sdk/squid-sdk/resources/persisting-data/file#setforceflush) was made anywhere in the batch handler.

    See the `file-store` [guide](/en/sdk/squid-sdk/resources/persisting-data/file) and the [reference pages of its extensions](/en/sdk/squid-sdk/reference/store/file).
  </Tab>

  <Tab title="BigQuery">
    Follow the [guide](/en/sdk/squid-sdk/resources/persisting-data/bigquery).
  </Tab>
</Tabs>

<h3 id="batch-handler-persistence">
  VI. Persist the transformed data to your data sink
</h3>

Once your data is [decoded](#batch-handler-decoding), optionally [enriched with external data](#external-data) and transformed the way you need it to be, it is time to save it.

<Tabs>
  <Tab title="PostgreSQL+GraphQL">
    For each batch, create all the instances of all TypeORM model classes at once, then save them with the minimal number of calls to `upsert()` or `insert()`, e.g.:

    ```ts theme={"system"}
    import { EntityA, EntityB } from './model'

    processor.run(new TypeormDatabase(), async ctx => {
      const aEntities: Map<string, EntityA> = new Map() // id -> entity instance
      const bEntities: EntityB = []

      for (let block of ctx.blocks) {
        // fill the containets aEntities and bEntities
      }

      await ctx.store.upsert([...aEntities.values()])
      await ctx.store.insert(bEntities)
    })
    ```

    It will often make sense to keep the entity instances in maps rather than arrays to make it easier to reuse them when defining instances of other entities with [relations](/en/sdk/squid-sdk/reference/schema-file/entity-relations) to the previous ones. The process is described in more detail in the [step 2 of the BAYC tutorial](/en/sdk/squid-sdk/tutorials/bayc/step-two-deriving-owners-and-tokens).

    If you perform any [database lookups](/en/sdk/squid-sdk/reference/store/typeorm#typeorm-methods), try to do so in batches and make sure that the entity fields that you're searching over are [indexed](/en/sdk/squid-sdk/reference/schema-file/indexes-and-constraints).

    See also the [patterns](/en/sdk/squid-sdk/resources/batch-processing#patterns) and [anti-pattens](/en/sdk/squid-sdk/resources/batch-processing#anti-patterns) sections of the Batch processing guide.
  </Tab>

  <Tab title="filesystem dataset">
    You can enqueue the transformed data for writing whenever convenient without any sizeable impact on performance.

    At low output data rates (e.g. if your entire dataset is in tens of Mbytes or under) take care to call [`ctx.store.setForceFlush()`](/en/sdk/squid-sdk/resources/persisting-data/file#setforceflush) when appropriate to make sure your data actually gets written.
  </Tab>

  <Tab title="BigQuery">
    You can enqueue the transformed data for writing whenever convenient without any sizeable impact on performance. The actual data writing will happen automatically at the end of each batch.
  </Tab>
</Tabs>

## The top-down development cycle

The [bottom-up development cycle](#bottom-up-development) described above is convenient for inital squid development and for trying out new things, but it has the disadvantage of not having the means of saving the data ready at hand when initially writing the data decoding/transformation code. That makes it necessary to come back to that code later, which is somewhat inconvenient e.g. when adding new squid features incrementally.

The alternative is to do the same steps in a different order:

1. [Update the store](#store)
2. If necessary, [regenerate the utility classes](#typegen)
3. [Update the processor configuration](#processor-config)
4. [Decode and normalize the added data](#batch-handler-decoding)
5. [Retrieve any external data](#external-data) if necessary
6. [Add the persistence code for the transformed data](#batch-handler-persistence)

## GraphQL options

[Store your data to PostgreSQL](/en/sdk/squid-sdk/resources/persisting-data/typeorm), then consult [Serving GraphQL](/en/sdk/squid-sdk/resources/serving-graphql) for options.

## Scaling up

If you're developing a large squid, make sure to use [batch processing](/en/sdk/squid-sdk/resources/batch-processing) throughout your code.

A common mistake is to make handlers for individual event logs or transactions; for updates that require data retrieval that results in lots of small database lookups and ultimately in poor syncing performance. Collect all the relevant data and process it at once. A simple architecture of that type is discussed in the [BAYC tutorial](/en/sdk/squid-sdk/tutorials/bayc).

You should also check the [Cloud best practices page](/en/cloud/resources/best-practices) even if you're not planning to deploy to [SQD Cloud](/en/cloud) - it contains valuable performance-related tips.

Many issues commonly arising when developing larger squids are addressed by the third party [`@belopash/typeorm-store` package](https://github.com/belopash/squid-typeorm-store). Consider using it.

For complete examples of complex squids take a look at the [Giant Squid Explorer](https://github.com/subsquid-labs/giant-squid-explorer) and [Thena Squid](https://github.com/subsquid-labs/thena-squid) repos.

## Next steps

* Learn about [batch processing](/en/sdk/squid-sdk/resources/batch-processing).
* Learn how squid deal with [unfinalized blocks](/en/sdk/squid-sdk/resources/unfinalized-blocks).
* [Use external APIs and IPFS](/en/sdk/squid-sdk/resources/external-api) in your squid.
* See how squid should be set up for the [multichain setting](/en/sdk/squid-sdk/resources/multichain).
* Deploy your squid [on own infrastructure](/en/sdk/squid-sdk/resources/self-hosting) or to [SQD Cloud](/en/cloud).
