Introduction
The intermix.io Python plugin is a library for adding annotations to SQL queries. Annotating your SQL lets you detect custom query properties in intermix.io and create fine-grained Jobs definitions for bespoke applications.
The plugin works by prepending each query with a SQL comment containing metadata about the query itself. Because the metadata lives in a comment, it does not slow down query execution or change what the query does. The comment supplies the data shown in our analytics service.
When to use this
Use the plugin if you run a Python application that issues SQL queries and you want to tag those queries, for example an application built on Pinterest's Pinball or Spotify's Luigi.
Installation
Run
pip install intermix
and add the package to your requirements.txt.
SDK Supported Properties
Field | Description | Example | Required? |
app | A keyword used to identify your application. This name will appear in the product in aggregations. | "etl-loader" | yes |
app_ver | A version string for your app. | "1.2" | yes |
plugin | The name of the software library that is being used to generate this annotation. | "intermix-airflow" | yes |
plugin_ver | A version string for the plugin named above. | "1.0a" | yes |
dag | The DAG executing the SQL. This can be, for example, an Airflow DAG, a Pinball workflow, or a string that identifies the logical collection of tasks that this SQL is part of. | "load-sales-data" | yes |
task | The task that is executing the SQL, e.g. an Airflow "task" or a Pinball "job". This can be a series of queries executing in a single transaction. | "load-ny-sales" | yes |
user | The user executing the task. | "joesmith" | no |
meta | A sub-object of additional user-defined key-value fields. | {"department": "east", "report_id": "6"} | no |
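As a quick way to check an annotation against the table above before sending it, the sketch below lists any required fields that are missing. The names `REQUIRED_FIELDS`, `OPTIONAL_FIELDS`, and `check_annotation` are hypothetical and not part of the plugin. Only the field names and the required/optional split come from the table.

```python
# Field names taken from the table above; the helper itself is hypothetical.
REQUIRED_FIELDS = ("app", "app_ver", "plugin", "plugin_ver", "dag", "task")
OPTIONAL_FIELDS = ("user", "meta")


def check_annotation(annotation: dict) -> list:
    """Return a list of problems found in `annotation` (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat an absent or empty value as missing.
        if not annotation.get(field):
            problems.append(f"missing required field: {field}")
    for field in annotation:
        if field not in REQUIRED_FIELDS + OPTIONAL_FIELDS:
            problems.append(f"unknown field: {field}")
    if "meta" in annotation and not isinstance(annotation["meta"], dict):
        problems.append("meta must be a dict of key-value pairs")
    return problems


example = {
    "app": "etl-loader",
    "app_ver": "1.2",
    "plugin": "intermix-airflow",
    "plugin_ver": "1.0a",
    "dag": "load-sales-data",
    "task": "load-ny-sales",
    "user": "joesmith",
    "meta": {"department": "east", "report_id": "6"},
}
print(check_annotation(example))  # prints []
```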
Sample Code
Let's say you have the query
select count(*) from users;
that you execute in a batch process. To use the annotation feature, you would do the following: