Quick Start

Sign Up

Sign up for a Rockset account using your GitHub account, Google account, or email address. During your two-week trial, you have $300 worth of credits to use however you see fit. No credit card is required. Billing is determined by your compute (Virtual Instance size) and storage usage. After your trial ends, you can continue to use Rockset for free by using the FREE Virtual Instance and staying below the 2 GB of free storage.

Pro tip: If you're using a larger Virtual Instance (VI) for your trial, switch it back to the Shared VI when you're not actively using it to save credits. For more information, see the Billing documentation.

Create a Collection

Now that you've created an account and understand how to use your $300 free trial credits, let's navigate to the Collections tab of the Rockset Console where we will create your first collection!

A collection in Rockset is a set of Rockset documents. Similar to tables in traditional SQL databases, collections can be queried using SQL, either directly or using Query Lambdas.

In this tutorial, we will create two collections from public datasets hosted on AWS S3:

  • The Movies Dataset: Film Releases: A sample of movies and information including their genre, popularity, and revenue.
  • The Movies Dataset: Film Ratings: A sample of movie ratings by user.

Both datasets are publicly available here.

Create the Film Releases Collection

Follow the steps below to create the Film Releases collection:

  1. Click Create your first Collection in the Collections tab of the Rockset Console:

    New Collection

  2. Select Public Datasets as the data source for your collection:

    Create Collection

  3. Select the dataset The Movies Dataset: Film Releases and click Start below:

    The Movies Dataset: Film Releases

  4. Configure and create the collection:

    • Name your collection film_releases in the Collection Name field.
    • Under the Configure Ingest section, you will see that an Ingest Transformation has already been predefined for you. This transformation processes incoming data as it is written into your Rockset collection, cleaning up the data and defining data types for a few fields. For this tutorial, we recommend that you do not change the predefined ingest transformation, because doing so may affect the queries used later on.
    • When you're ready, click Create at the bottom of the page to complete the creation of your collection.

    The source preview is automatically generated so you can explore the semi-structured JSON data in a tabulated form:

    The Movies Dataset: Film Releases

Creating the collection takes about 3 minutes. Its initial status is Created, which then changes to Ready. At this point, documents begin to flow into the collection gradually until the entire dataset is ingested. When ingestion completes, you should see something like this:

The Movies Dataset: Film Releases

Note: You may need to refresh the screen for the status to update.

Create the Film Ratings Collection

Now repeat the same steps above to create the Film Ratings collection. This time select the dataset The Movies Dataset: Film Ratings and name your collection film_ratings in the Collection Name field.

The Movies Dataset: Film Ratings

Execute a Query

Now that both collections have been set up, we can use SQL to query the two collections.

Sample Query

Below is a sample query that suggests movies to a user based on their genre preference and the movie's popularity. Since genres is an array field (a single movie may fit multiple genres), we use UNNEST to expand this array and create a record for each (genre, movie) pair. We also exclude movies that the specified user has already rated.

Follow the steps below to enter and use the query:

  1. Navigate to the Query Editor tab of the Rockset Console to start writing and executing SQL queries.

  2. Copy the query below into the SQL editing area:

    SELECT
        m.id,
        m.title
    FROM
        commons.film_releases m,
        UNNEST(m.genres) as genres
    WHERE
        genres.name = 'Action'
        AND m.id NOT IN (
            SELECT
                r.movie_id
            FROM
                commons.film_ratings r
            WHERE
                r.user_id = 100
        )
    ORDER BY
        m.popularity DESC;

  3. Click Run to execute the query. The Results tab below the query shows the rows returned from the query:

    Results Tab
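To make the query logic concrete, here is a minimal Python sketch of what the statement above does, step by step. The documents are made-up toy data, not rows from the real datasets, and the `recommend` function is a hypothetical helper used only for illustration. It models the query's behavior; it is not how Rockset executes SQL.

```python
# Toy stand-ins for the two collections. Field names mirror the query above,
# but the documents themselves are invented for illustration.
film_releases = [
    {"id": 1, "title": "Alpha", "popularity": 9.5,
     "genres": [{"name": "Action"}, {"name": "Comedy"}]},
    {"id": 2, "title": "Bravo", "popularity": 7.0,
     "genres": [{"name": "Drama"}]},
    {"id": 3, "title": "Charlie", "popularity": 8.2,
     "genres": [{"name": "Action"}]},
    {"id": 4, "title": "Delta", "popularity": 9.9,
     "genres": [{"name": "Action"}, {"name": "Thriller"}]},
]
film_ratings = [
    {"user_id": 100, "movie_id": 4},
    {"user_id": 200, "movie_id": 1},
]


def recommend(releases, ratings, genre, user_id):
    # Subquery: ids of the movies this user has already rated.
    rated = {r["movie_id"] for r in ratings if r["user_id"] == user_id}
    # UNNEST: one (movie, genre) row per element of the genres array.
    rows = [(m, g) for m in releases for g in m["genres"]]
    # WHERE: keep the requested genre and drop already-rated movies.
    kept = [m for m, g in rows if g["name"] == genre and m["id"] not in rated]
    # ORDER BY m.popularity DESC
    kept.sort(key=lambda m: m["popularity"], reverse=True)
    # SELECT m.id, m.title
    return [(m["id"], m["title"]) for m in kept]


result = recommend(film_releases, film_ratings, "Action", 100)
print(result)  # [(1, 'Alpha'), (3, 'Charlie')]
```

In this toy data, "Delta" is the most popular Action movie but is left out because user 100 has already rated it.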

Sample Query with Parameters

In the above query, we used the Action genre and user 100. Now, let's make these values parameters that can be specified at runtime. Follow the steps below to add and test these parameters:

  1. Select the Parameters tab below the SQL editing area next to the Results tab, and click Add Parameter to create a new parameter:

    Query Add New Parameter

  2. Populate the parameter details with the following and click Add:

    • Set Parameter Name to genre.
    • Set Type to string.
    • Set Parameter Value to Action.

    Query Add New Parameter1

  3. Repeat steps 1 and 2 with the following parameter details and click Add:

    • Set Parameter Name to user_id.
    • Set Type to int.
    • Set Parameter Value to 100.

  4. Modify the SQL statement from the previous section to incorporate the parameters created in steps 2 and 3 above:

    • Replace genres.name = 'Action' with genres.name = :genre.
    • Replace r.user_id = 100 with r.user_id = :user_id.

    Here is the new SQL statement with these updates:

    SELECT
        m.id,
        m.title
    FROM
        commons.film_releases m,
        UNNEST(m.genres) as genres
    WHERE
        genres.name = :genre
        AND m.id NOT IN (
            SELECT
                r.movie_id
            FROM
                commons.film_ratings r
            WHERE
                r.user_id = :user_id
        )
    ORDER BY
        m.popularity DESC;

  5. Click Run to execute the query. The Results tab below the query shows the rows returned for the Action genre with the User ID 100:

    Parameterized Query Results
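The idea behind named parameters, one SQL statement reused with different values bound at runtime, can be tried locally. The sketch below uses Python's built-in sqlite3 module, which happens to accept the same `:name` placeholder syntax. It is not Rockset: SQLite has no UNNEST over array fields, so the genres are assumed to be pre-flattened into a hypothetical `film_genres` table, and all rows are made-up toy data.

```python
import sqlite3

# In-memory SQLite database standing in for the two collections.
# Assumption: genres are already flattened to one row per (movie, genre).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film_genres (movie_id INTEGER, title TEXT, popularity REAL, genre TEXT);
    CREATE TABLE film_ratings (user_id INTEGER, movie_id INTEGER);
    INSERT INTO film_genres VALUES
        (1, 'Alpha', 9.5, 'Action'),
        (1, 'Alpha', 9.5, 'Comedy'),
        (3, 'Charlie', 8.2, 'Action'),
        (4, 'Delta', 9.9, 'Action');
    INSERT INTO film_ratings VALUES (100, 4), (200, 1);
""")

# The statement is written once; :genre and :user_id are bound per execution.
QUERY = """
    SELECT g.movie_id, g.title
    FROM film_genres g
    WHERE g.genre = :genre
      AND g.movie_id NOT IN (
          SELECT r.movie_id FROM film_ratings r WHERE r.user_id = :user_id
      )
    ORDER BY g.popularity DESC
"""


def run(genre, user_id):
    # Values are passed separately from the SQL text, never spliced into it.
    return conn.execute(QUERY, {"genre": genre, "user_id": user_id}).fetchall()


for_user_100 = run("Action", 100)
for_user_200 = run("Action", 200)
print(for_user_100)  # [(1, 'Alpha'), (3, 'Charlie')]
print(for_user_200)  # [(4, 'Delta'), (3, 'Charlie')]
```

Keeping the values out of the SQL text means the same statement can serve any genre or user without editing the query, which is the property the Parameters tab provides in the Query Editor.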

This completes the quickstart tutorial! Now that you’ve got a handle on the basics, the next topic walks through how to Execute a Query from a dedicated REST endpoint.