Query Lambdas: Increasing Developer Velocity for Application Development

March 12, 2020

,

At Rockset we strive to make building modern data applications easy and intuitive. Data-backed applications come with an inherent amount of complexity - managing the database backend, exposing a data API (often using hard-coded SQL or an ORM to write queries), keeping the data and application code in sync... the list goes on. Just as Rockset has reimagined and dramatically simplified the traditional ETL pipeline on the data-loading side, we’re now proud to release a new product feature - Query Lambdas - that similarly rethinks the data application development workflow.

Application Development on Rockset: Status Quo

The traditional application development workflow on Rockset has looked something the the following:

Step 1: Construct SQL query in the Rockset Console

For this case, let’s use the sample query:

-- select events for a particular user in the last 5 days
SELECT
    `event, event_time`
FROM
    "User-Activity"
WHERE
    userId = '...@rockset.com'
    AND event_time > CURRENT_TIMESTAMP() - DAYS(5)

Step 2: Substitute out hard-coded values or add filters manually using Query Parameters

Let’s say we want to generalize this query to support arbitrary user emails and time durations. The query SQL would look something like this:

-- select events for any particular user in the last X days
SELECT
    event, event_time
FROM
    "User-Activity"
WHERE
    userId = :userId
    AND event_time > CURRENT_TIMESTAMP() - DAYS(:days)

Step 3: Hardcode raw SQL into your application code along with parameter values

For a Node.js app using our Javascript client, this code would look something like:

client.queries
    .query({
      sql: {
        query: `SELECT
      event, event_time
  FROM
      "User-Activity"
  WHERE
      userId = :userId
      AND event_time > CURRENT_TIMESTAMP() - DAYS(:days)`,
      },
      parameters: [
        {
          name: 'userId',
          type: 'string',
          value: '...',
        },
        {
          name: 'days',
          type: 'int',
          value: '5',
        },
      ],
    })
    .then(console.log);

While this simple workflow works well for small applications and POCs, it does not accommodate the more complex software development workflows involved in building production applications. Production applications have stringent performance monitoring and reliability requirements. Making any changes to a live application or the database that serves that application needs to be given the utmost care. Production applications also have stringent security requirements and should prevent bugs like SQL injection bug at all costs. Some of the drawbacks of the above workflow include:

  • Raw SQL in application code: Embedding raw SQL in application code can be difficult — often special escaping is required for certain characters in the SQL. It can even be dangerous, as a developer may not realize the hazards of using string interpolation to customize their query to specific users / use-cases as opposed to Query Parameters and thus create a serious vulnerability.
  • Managing the SQL development / application development lifecycle: Simple queries are easy to build and manage. But as queries get more complex, expertise is usually split between a data team and an application development team. In this existing workflow, it is hard for these two teams to collaborate safely on Rockset - for example, a database administrator might not realize that a collection is actively being queried by an application and delete it. Likewise, a developer may tweak the SQL (for example, selecting an additional field or adding an ORDER BY clause) to better fit the needs of the application and create a 10-100x slowdown without realizing it.
  • Query iteration in application code: Can be tedious — to take advantage of the bells and whistles of our SQL Editor, you have to take the SQL out of the application code, unescape / fill parameters as needed, put it into the SQL editor, iterate, reverse the process to get back into your application and try again. As someone who has built several applications and dashboards backed by Rockset, I know how painful this can be :D
  • Query metrics: Without custom implementation work application-side, there’s no way to understand how a particular query is or is not performing. Each execution, from Rockset’s perspective, is entirely independent of every other execution, and so no stats are aggregated, no alerts or warnings configurable, and any visibility into such topics must be implemented as part of the application itself.

Application / Dashboard Development on Rockset with Query Lambdas

Query Lambdas are named parameterized SQL queries stored in Rockset that can be executed from a dedicated REST endpoint. With Query Lambdas, you can:

  • version-control your queries so that developers can collaborate easily with their data teams and iterate faster
  • avoid querying with raw SQL directly from application code and avoid SQL injection security risks by hitting Query Lambda REST endpoints directly, with query parameters automatically turned into REST parameters
  • write a SQL query, include parameters, create a Query Lambda and simply share a link with another application developer
  • see which queries are being used by production applications and ensure that all updates are handled elegantly
  • organize your queries by workspace similarly to the way you organize your collections
  • create / update / delete Query Lambdas through a REST API for easy integration in CI / CD pipelines

Using the same example as above, the new workflow (using Query Lambdas) looks more like this:

Step 1: Construct SQL query in the Console, using parameters now natively supported

Screen Shot 2020-03-11 at 5.04.09 PM

Step 2: Create a Query Lambda

Screen Shot 2020-03-11 at 5.06.10 PM

Step 3: Use Rockset's SDKs or the REST API to trigger executions of that Query Lambda in your app

Example using Rockset’s Python client library:

from rockset import Client, ParamDict
rs = Client()

qlambda = rs.QueryLambda.retrieve(
    'myQueryLambda',
    version=1,
    workspace='commons')

params = ParamDict()
params['days'] = 5
params['userId'] = '...@rockset.com'

results = qlambda.execute(parameters=params)

Example using REST API directly (using Python’s requests library):

payload = json.loads('''{ 
  "parameters": [
    { "name": "userId", "value": "..." },
    { "name": "days", "value": "5" }
  ] 
}''')
r = requests.post(
  'https://api.rs2.usw2.rockset.com/v1/orgs/self/ws/commons/queries/{queryName}/versions/1',
   json=payload,
   headers={'Authorization': 'ApiKey ...'}
)

Let’s look back at each of the shortcomings of the ‘Status Quo’ workflow and see how Query Lambdas address them:

  • Raw SQL in application code: Raw SQL no longer ever needs to live in application code. No temptation to string interpolate, just a unique identifier (query name and version) and a list of parameters if needed that unambiguously resolve to the saved SQL. Each execution will always fetch fresh results - no caching or staleness to worry about.
  • Managing the SQL development / application development lifecycle: With Query Lambdas, a SQL developer can write the query, include parameters, create a Query Lambda and simply share a link (or even less - the name of the Query Lambda alone will suffice to use the REST API) with an application developer. Database administrators can see for each collection any Query Lambda versions that use that collection and thus ensure that all applications are updated to newer versions before deleting any underlying data.
  • Query iteration in application code: Query iteration and application iteration can be entirely separated. Since each Query Lambda version is immutable (you cannot update its SQL or parameters without also incrementing its version), application functionality will remain constant even as the Query Lambda is updated and tested in staging environments. To switch to a newer or older version, simply increment or decrement the version number in your application code.
  • Query metrics: Since each Query Lambda version has its own API endpoint, Rockset will now automatically maintain certain statistics and metrics for you. To start with, we’re exposing for every version: Last Queried (time), Last Queried (user), Last Error (time), Last Error (error message). More to come soon!

Summary

We’re incredibly excited to announce this feature. This initial release is just the beginning - stay tuned for future Query Lambda related features such as automated execution and alerting, advanced monitoring and reporting, and even referencing Query Lambdas in SQL queries.

As part of this release, we’ve also added a new Query Editor UI, new REST API endpoints and updated SDK clients for all of the languages we support. Happy hacking!

More you’d like to see from us? Send us your thoughts at product[at][rockset.com]