Quick Start
Sign Up
Sign up for a Rockset account using your GitHub account, Google account, or email address. During your two-week trial, you have $300 worth of credits to use however you see fit. No credit card is required! Billing is determined by your compute (Virtual Instance size) and storage usage. After your trial ends, you can continue to use Rockset for free by using the FREE Virtual Instance and staying below the 2 GB of free storage.
Pro-tip: If you’re using a larger Virtual Instance (VI) for your trial, switch it back to the Shared VI when you’re not actively using it to save credits! For more information, see the Billing documentation.
Create a Collection
Now that you've created an account and understand how to use your $300 free trial credits, let's navigate to the Collections tab of the Rockset Console where we will create your first collection!
A collection in Rockset is a set of Rockset documents. Similar to tables in traditional SQL databases, collections can be queried using SQL, either directly or using Query Lambdas.
In this tutorial, we will create two collections from public datasets hosted on AWS S3:
- The Movies Dataset: Film Releases: A sample of movies and information including their genre, popularity, and revenue.
- The Movies Dataset: Film Ratings: A sample of movie ratings by user.
Both datasets are publicly available here.
Create the Film Releases Collection
Follow the steps below to create the Film Releases collection:
1. Click Create your first Collection in the Collections tab of the Rockset Console.
2. Select Public Datasets as the data source for your collection.
3. Select the dataset The Movies Dataset: Film Releases and click Start below.
4. Configure and create the collection:
   - Name your collection film_releases in the Collection Name field.
   - Under the Configure Ingest section, you will see that an Ingest Transformation has already been predefined for you. This transformation processes incoming data as it is written into your Rockset collection, cleaning up the data and defining data types for a few fields. For this tutorial, we recommend that you do not change the predefined ingest transformation, as doing so may affect the queries later on.
   - When you're ready, click Create at the bottom of the page to complete the creation of your collection.
   The source preview is automatically generated so you can explore the semi-structured JSON data in a tabulated form.
The collection creation process takes about 3 minutes to complete. Its initial status will be Created, after which it will change to Ready. At this point, documents will begin to flow into the collection gradually until the entire dataset is ingested.
Note: You may need to refresh the screen for the status to update.
Create the Film Ratings Collection
Now repeat the steps above to create the Film Ratings collection. This time, select the dataset The Movies Dataset: Film Ratings and name your collection film_ratings in the Collection Name field.
Execute a Query
Now that both collections have been set up, we can use SQL to query the two collections.
Sample Query
Below is a sample query that suggests movies to a user based on their genre preference and the movie's popularity. Since genres is an array field (a single movie may fit multiple genres), we use UNNEST to expand this array and create a record for each (genre, movie) pair. We also exclude movies already rated by a specified user.
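To make the effect of UNNEST concrete, here is a minimal Python sketch of the same expansion. It is an illustration only, not how Rockset executes the query: the documents below are made-up toy data shaped like the tutorial's film_releases collection, and unnest_genres is a hypothetical helper name.

```python
# Toy documents shaped like the film_releases collection (all values are invented).
film_releases = [
    {"id": 1, "title": "Movie A", "genres": [{"name": "Action"}, {"name": "Comedy"}]},
    {"id": 2, "title": "Movie B", "genres": [{"name": "Drama"}]},
    {"id": 3, "title": "Movie C", "genres": []},
]


def unnest_genres(movies):
    """Yield one (movie, genre) pair per element of each movie's genres array."""
    for movie in movies:
        for genre in movie["genres"]:
            yield movie, genre


# Every (movie id, genre name) pair produced by the expansion.
pairs = [(m["id"], g["name"]) for m, g in unnest_genres(film_releases)]

# Filtering the expanded pairs, like WHERE genres.name = 'Action'.
action_ids = [m["id"] for m, g in unnest_genres(film_releases) if g["name"] == "Action"]

print(pairs)
print(action_ids)
```

In this sketch, a movie with two genres contributes two pairs, and a movie with an empty genres array contributes none, which is why filtering on the genre name works one pair at a time.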
Follow the steps below to enter and use the query:
1. Navigate to the Query Editor tab of the Rockset Console to start writing and executing SQL queries.
2. Copy the query below into the SQL editing area:
   SELECT
       m.id,
       m.title
   FROM
       commons.film_releases m,
       UNNEST(m.genres) as genres
   WHERE
       genres.name = 'Action'
       AND m.id NOT IN (
           SELECT
               r.movie_id
           FROM
               commons.film_ratings r
           WHERE
               r.user_id = 100
       )
   ORDER BY
       m.popularity DESC;
3. Click Run to execute the query. The Results tab below the query shows the rows returned from the query.
Sample Query with Parameters
In the above query, we used the Action genre and user 100. Now, let's make these values parameters that can be specified at runtime. Follow the steps below to add and test these parameters:
1. Select the Parameters tab below the SQL editing area, next to the Results tab, and click Add Parameter to create a new parameter.
2. Populate the parameter details with the following and click Add:
   - Set Parameter Name to genre.
   - Set Type to string.
   - Set Parameter Value to Action.
3. Repeat steps 1 and 2 with the following parameter details and click Add:
   - Set Parameter Name to user_id.
   - Set Type to int.
   - Set Parameter Value to 100.
4. Modify the SQL statement from the previous topic to incorporate the parameters created in steps 2 and 3 above:
   - Replace genres.name = 'Action' with genres.name = :genre.
   - Replace r.user_id = 100 with r.user_id = :user_id.
   Here is the new SQL statement with these updates:
   SELECT
       m.id,
       m.title
   FROM
       commons.film_releases m,
       UNNEST(m.genres) as genres
   WHERE
       genres.name = :genre
       AND m.id NOT IN (
           SELECT
               r.movie_id
           FROM
               commons.film_ratings r
           WHERE
               r.user_id = :user_id
       )
   ORDER BY
       m.popularity DESC;
5. Click Run to execute the query. The Results tab below the query shows the rows returned for the Action genre with the User ID 100.
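If you want to experiment with the :genre and :user_id placeholder style outside the Console, the following is a rough local sketch using Python's built-in sqlite3 module, which happens to accept the same :name syntax. It is not Rockset's API. The tables and rows are invented toy data, the film_genres table is an assumption that stands in for UNNEST(m.genres) (which this sketch does not use), and recommend is a hypothetical helper name.

```python
import sqlite3

# In-memory database with toy rows; film_genres stands in for UNNEST(m.genres).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film_releases (id INTEGER, title TEXT, popularity REAL);
    CREATE TABLE film_genres (movie_id INTEGER, name TEXT);
    CREATE TABLE film_ratings (movie_id INTEGER, user_id INTEGER);
    INSERT INTO film_releases VALUES (1, 'Movie A', 9.5), (2, 'Movie B', 7.0), (3, 'Movie C', 8.2);
    INSERT INTO film_genres VALUES (1, 'Action'), (2, 'Action'), (3, 'Action'), (3, 'Comedy');
    INSERT INTO film_ratings VALUES (1, 100), (2, 200);
""")

# Same shape as the tutorial query, with :genre and :user_id bound at runtime.
QUERY = """
    SELECT m.id, m.title
    FROM film_releases m
    JOIN film_genres g ON g.movie_id = m.id
    WHERE g.name = :genre
      AND m.id NOT IN (
          SELECT r.movie_id FROM film_ratings r WHERE r.user_id = :user_id
      )
    ORDER BY m.popularity DESC
"""


def recommend(genre, user_id):
    """Run the query with the given parameter values and return all rows."""
    return conn.execute(QUERY, {"genre": genre, "user_id": user_id}).fetchall()


print(recommend("Action", 100))
print(recommend("Action", 200))
```

Because the values are bound by name rather than pasted into the SQL text, the same statement can be reused for any genre and user without editing the query itself.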
This completes the quickstart tutorial! Now that you’ve got a handle on the basics, the next topic walks through how to Execute a Query from a dedicated REST endpoint.