How To Query The Ethereum Blockchain

March 9, 2023


Blockchain technology has revolutionized the way we store and access data. The decentralized nature of blockchain allows for transparency and immutability, making it an ideal technology for a variety of industries. Originally popularized by Bitcoin in 2009, there has since been a surge in blockchain platforms launched around the world.

The most prominent blockchain platform is the Ethereum blockchain, which in 2021 surpassed Bitcoin to become the most popular blockchain network in the world (as measured by number of transactions). The Ethereum blockchain executes smart contracts to enable developers to build decentralized applications. It’s powered by Ether, the cryptocurrency used as fuel to pay for transactions on the network. In this blog post, we will explore three different ways to query the Ethereum blockchain.

1. Ethereum Clients

The most basic way to access Ethereum blockchain data is by hosting a node yourself on your local computer, and then querying that node directly. This can be done by using an Ethereum execution client (previously known as “Eth1 Clients”), each of which implements a JSON-RPC specification to provide a uniform set of methods for accessing Ethereum blockchain data. The most popular Ethereum clients are Geth, Nethermind, Erigon, and Besu.

Hosting an Ethereum Node using Geth

Here’s a quick example of how you can use Geth to query Ethereum blockchain data:

  1. Install Geth: The first step is to install Geth on your computer. You can download Geth from the Ethereum website and install it according to the instructions for your operating system.
  2. Download the Ethereum blockchain: After installing Geth, run it by opening a terminal window and executing the “geth” command. Geth will begin downloading the Ethereum blockchain from its peers, which can take several hours or longer depending on your hardware and the speed of your internet connection. Note that since the Merge in September 2022, an execution client such as Geth must be paired with a consensus client in order to follow the chain. You can monitor the progress of the download by checking the log messages in the terminal window.
  3. Start syncing: As the data arrives, Geth syncs with the Ethereum network. This process involves verifying the blocks in the Ethereum blockchain and updating the node’s local copy of the Ethereum state. You can monitor the progress of the sync by checking the log messages in the terminal window.
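
Besides reading the logs, sync progress can also be checked programmatically. The sketch below interprets the result of the standard `eth_syncing` JSON-RPC method, which returns `false` when the node is not currently syncing and otherwise an object with hex-encoded block numbers. The helper name `describeSync` and the sample values are hypothetical.

```javascript
// Interpret the result of the standard `eth_syncing` JSON-RPC method.
// The method returns `false` when the node is not syncing, or an object
// whose block numbers are hex-encoded strings.
function describeSync(result) {
  if (result === false) {
    return { syncing: false };
  }
  const current = parseInt(result.currentBlock, 16);
  const highest = parseInt(result.highestBlock, 16);
  // Guard against division by zero before a target block is known.
  const percent = highest === 0 ? 0 : Math.floor((current / highest) * 100);
  return { syncing: true, current, highest, percent };
}

// Example with a hand-written response (values are illustrative only).
console.log(describeSync({ currentBlock: "0x64", highestBlock: "0xc8" }));
```

Keep in mind that a `false` result only means the node is not syncing at that moment; a node that has not yet found peers can also report `false`.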

That’s it! Once Geth is fully synced, you can use it to interact with the Ethereum network. For example, you can execute transactions, deploy smart contracts, and query the Ethereum blockchain data. You can do so by issuing JSON-RPC requests directly, but it’s more likely you’ll want to use a library with a more user-friendly interface, such as Web3.js.
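
To make the option of issuing JSON-RPC requests directly more concrete, here is a minimal sketch of a raw request. It assumes Geth was started with its HTTP endpoint enabled (for example with the `--http` flag) at `http://localhost:8545`, and that Node.js 18 or later is available for the built-in `fetch`. The helper names are hypothetical; `eth_blockNumber` is a standard JSON-RPC method.

```javascript
// Build a JSON-RPC 2.0 request body for an Ethereum node.
function buildRpcRequest(method, params = [], id = 1) {
  return { jsonrpc: "2.0", id, method, params };
}

// Send the request over HTTP and return the `result` field.
// Assumes a node is listening on the given URL.
async function rpcCall(url, method, params = []) {
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRpcRequest(method, params)),
  });
  const payload = await response.json();
  if (payload.error) {
    throw new Error(`RPC error ${payload.error.code}: ${payload.error.message}`);
  }
  return payload.result;
}

// Quantities come back hex-encoded, so convert before displaying them.
async function getBlockNumber(url = "http://localhost:8545") {
  const hex = await rpcCall(url, "eth_blockNumber");
  return parseInt(hex, 16);
}

// Usage (requires a running node):
// getBlockNumber().then((n) => console.log("Latest block:", n));
```

Every request has the same shape, which is why libraries can wrap the protocol so easily: only the `method` name and `params` array change from call to call.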

Accessing the Ethereum Blockchain with Web3.js

Web3.js is a JavaScript library that is specifically designed to make it easy to interact with Ethereum blockchain nodes. The library uses JSON-RPC calls to query Ethereum blockchain data, but abstracts them away completely to provide an easy-to-use interface for developers. Web3.js is commonly used in web applications that interact with a blockchain, such as DApps (Decentralized Applications) and blockchain wallets.

// Import the web3.js library
const Web3 = require('web3');

// Connect to a local Ethereum node
const web3 = new Web3(new Web3.providers.HttpProvider("http://localhost:8545"));

// Query the current block number
web3.eth.getBlockNumber().then(console.log);

In this example, we are using the Web3.js library to connect to a local Ethereum node running on the default JSON-RPC port (8545), but you can always connect to a remote node by providing its URL instead of localhost. You can also use Web3.js to query other data on the Ethereum blockchain, such as the balance of an Ethereum address or the transaction history of an address:

// Query the balance of an Ethereum address

// Query the transaction history of an Ethereum address

Note that Web3.js is specific to Ethereum’s JSON-RPC interface; it will not work with blockchain platforms that do not expose an Ethereum-compatible API.
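
As a rough sketch of the two account queries mentioned above: Web3.js exposes `web3.eth.getBalance` and `web3.eth.getTransactionCount`, but standard JSON-RPC has no single call that returns an address’s full transaction history, so the example below reports the number of transactions sent from the address instead. The function name `summarizeAccount` and the stubbed values are assumptions made for illustration only.

```javascript
// Summarize an account using a connected Web3.js instance.
// `web3` is passed in so the function works with any provider.
async function summarizeAccount(web3, address) {
  // Balance is returned in wei; convert it to ether for readability.
  const balanceWei = await web3.eth.getBalance(address);
  const balanceEth = web3.utils.fromWei(balanceWei, "ether");
  // Number of transactions sent from this address (its nonce),
  // not the full list of transactions.
  const sentCount = await web3.eth.getTransactionCount(address);
  return { address, balanceEth, sentCount: Number(sentCount) };
}

// Stand-in object so the sketch can run without a node; with a real
// connection, pass the `web3` instance created earlier instead.
const stubWeb3 = {
  eth: {
    getBalance: async () => "1000000000000000000", // 1 ether in wei
    getTransactionCount: async () => 7,
  },
  utils: {
    fromWei: (wei) => (Number(wei) / 1e18).toString(),
  },
};

summarizeAccount(stubWeb3, "0x0000000000000000000000000000000000000000")
  .then((summary) => console.log(summary));
```

A complete per-address history generally requires scanning blocks or querying an indexed dataset, which is one reason the SQL approach described in section 3 is attractive.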

2. RPC Node Providers

An Ethereum RPC node provider is a service that offers access to Ethereum nodes via an API endpoint. This is essentially where a third-party service hosts an Ethereum node, but then provides a user-friendly API to access the network via their hosted node. By accessing an Ethereum node through an RPC API, you can interact with the Ethereum blockchain and execute various operations, such as querying data, sending transactions, and executing smart contracts.

An Ethereum RPC node provider typically hosts a cluster of Ethereum nodes and exposes a JSON-RPC API that developers can use to send requests to the Ethereum nodes. This allows developers to access Ethereum data and execute transactions without having to run their own Ethereum node, which can be complex and resource-intensive. By using an Ethereum RPC node provider, developers can quickly and easily interact with the Ethereum blockchain, making it an attractive option for decentralized applications and other blockchain-based projects.

Here are some of the top Ethereum RPC node providers:

  1. Infura is a popular Ethereum node provider that offers scalable and secure access to the Ethereum blockchain through its managed infrastructure. To use Infura, you just need to sign up for an API key and then use that key in your client library, such as Web3.js.
  2. Alchemy is a platform for building, deploying, and scaling decentralized applications. It provides access to Ethereum nodes and a suite of tools for interacting with the Ethereum blockchain, including a GraphQL API and an Ethereum node management system. With Alchemy, you can access Ethereum data and execute transactions with ease, and its managed infrastructure ensures high reliability and security for your decentralized applications.
  3. QuickNode is a provider of fast and secure Ethereum nodes. They offer a managed infrastructure for interacting with the Ethereum blockchain, and you can use their Ethereum nodes by sending JSON-RPC requests to their API endpoint. With QuickNode, you can quickly and easily access Ethereum data and execute transactions, making it a good choice for decentralized applications and other blockchain-based projects.
  4. Omnia is an Ethereum infrastructure provider that offers access to Ethereum nodes and other tools for building and deploying decentralized applications. They provide a secure and scalable infrastructure for interacting with the Ethereum blockchain, and you can use their Ethereum nodes by sending JSON-RPC requests to their API endpoint.

To use any of these Ethereum RPC node providers, you'll need to sign up for an API key and then use that key in your client library to send requests to the Ethereum node. The exact steps for doing this will depend on the client library you're using, but you can find more information and tutorials in the documentation for the node provider and the client library.
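
For illustration, the snippet below builds an HTTPS endpoint from an API key and shows where it would be handed to Web3.js. The URL templates follow the mainnet formats used by Infura and Alchemy and should be checked against each provider's current documentation; the helper name `providerUrl` and the `ETH_API_KEY` environment variable are hypothetical.

```javascript
// URL templates for two hosted providers (Ethereum mainnet).
// Treat these as assumptions and confirm them in the provider docs.
const ENDPOINT_TEMPLATES = {
  infura: (key) => `https://mainnet.infura.io/v3/${key}`,
  alchemy: (key) => `https://eth-mainnet.g.alchemy.com/v2/${key}`,
};

// Build the endpoint URL for a provider, failing loudly on bad input.
function providerUrl(provider, apiKey) {
  if (!Object.hasOwn(ENDPOINT_TEMPLATES, provider)) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  if (!apiKey) {
    throw new Error("Missing API key");
  }
  return ENDPOINT_TEMPLATES[provider](apiKey);
}

// Read the key from the environment rather than hard-coding it.
const apiKey = process.env.ETH_API_KEY || "YOUR_API_KEY";
const url = providerUrl("infura", apiKey);

// With Web3.js installed, the URL replaces the localhost address:
// const web3 = new Web3(new Web3.providers.HttpProvider(url));
console.log(url.replace(apiKey, "<redacted>"));
```

Because the key is embedded in the URL, it is worth keeping it out of source control and out of logs, which is why the example prints a redacted version.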

3. SQL Queries on Public Datasets

Perhaps the simplest and most efficient way to query blockchain data is still the traditional one: extract, transform, and load data from the blockchain into a database, where it is then indexed and made queryable. This method has been made particularly easy by companies like Google Cloud (dataset released in 2018) and Amazon Web Services (dataset released in 2022), which have each released public, actively maintained datasets for both Ethereum and Bitcoin. Anyone can ingest these datasets into a datastore for efficient querying via SQL.

Since these datasets are completely public (the AWS dataset is public and can be downloaded via Amazon S3 at any time), you can load them into the datastore of your choice. You can find step-by-step tutorials for Google BigQuery here and Amazon Athena here. In this blog, we’ll explain how you can query Ethereum blockchain data using Rockset.

Once you sign up for Rockset (where you’ll immediately get $300 in free trial credits), create your first collection using the Public Datasets option as your data source.

Select the dataset Blockchain Live Data: Ethereum Blocks and click Start.

Click through the collection creation form, name your collection, and hit the Create button. For this tutorial, we’ll be naming our collection ethereum_blocks without modifying any default configurations.

This collection will take around 20-30 minutes to complete its ingestion. During this time, Rockset will download the public dataset from Amazon S3 and ingest it into your Rockset collection, where it will then be automatically indexed in at least three ways using Rockset’s secret sauce: the Converged Index™. Once its status reaches the Ready state, you’re all set to query the Ethereum blockchain using SQL! Here’s a sample query you can run to find out how many new Ethereum blocks were created in the last day:

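    SELECT COUNT(*) FROM commons.ethereum_blocks e
    -- assumes the block time is stored in a field named "timestamp"
    WHERE e.timestamp > CURRENT_TIMESTAMP() - DAYS(1)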

The dataset downloaded and queried above is from AWS’ public dataset on Ethereum blocks, but you can always query other datasets by selecting Amazon S3 as your data source during collection creation, selecting Public Bucket, and providing your own desired S3 path. For example, using the S3 path s3://aws-public-blockchain/v1.0/eth/transactions will allow you to query live Ethereum transaction data, and the S3 path s3://aws-public-blockchain/v1.0/eth/contracts will allow you to query live Ethereum smart contract data. The full folder structure and dataset descriptions for the AWS public blockchain datasets can be found here.

Getting Started with Blockchain Analytics on Rockset

In this blog, we specifically discussed methods for querying data on the Ethereum blockchain, but there are endless possibilities for blockchain analytics on various blockchain networks and datasets. Be sure to check out our blog on 3 Use Cases for Real-Time Blockchain Analytics to learn more about the space, or check out this case study to see how a Web3 startup uses Rockset to power blockchain analytics on an NFT marketplace. Whenever you’re ready to query blockchain data yourself, create your Rockset account and get $300 in free trial credits to get started.